Abstract. We introduce a new optimal transport distance between nonnegative finite Radon measures with possibly different masses. The construction is based on non-conservative continuity equations and a corresponding modified Benamou-Brenier formula. We establish various topological and geometrical properties of the resulting metric space, derive some formal Riemannian structure, and develop differential calculus following F. Otto's approach. Finally, we apply these ideas to identify a model of animal dispersal proposed by MacCall and Cosner as a gradient flow in our formalism and obtain new long-time convergence results.

1. Introduction 

In the last decades the theory of Monge-Kantorovich optimal transportation problems has seen spectacular developments and provided new powerful tools, deep insights, and numerous applications in functional analysis, partial differential equations, and geometry. The list of references on this topic is steadily growing and we refer to [44] for an extended bibliography. Central ingredients to the theory are the Kantorovich-Rubinstein-Wasserstein distances and their variants, usually defined between probability measures. In their seminal paper [30], Jordan, Kinderlehrer and Otto showed that certain diffusion equations can be interpreted as gradient flows with respect to the Wasserstein metric structure. This was later pushed further by Otto [39], who showed that the space of probability measures can in fact be endowed with a formal Riemannian structure induced by the (quadratic) Wasserstein distance. This differential structure then led to new connections between long-time convergence of measure dynamics and functional inequalities via, e.g., entropy dissipation methods, Talagrand and logarithmic Sobolev inequalities, and Bakry-Emery strategies. See [?] for a review, and also Section 3 for a brief discussion.

One of the main restrictions of the Wasserstein distances is that they are limited to measures with fixed identical masses, and the theory requires uniform tightness (control of decay at infinity via the $p$-moments). Recently, some efforts [?] were made to construct new optimal transport distances between measures with different masses, and some non-mass-preserving reaction-diffusion systems were interpreted as gradient flows [?].

In this paper we introduce a new distance on the set of nonnegative finite Radon measures by means of a modified Benamou-Brenier formula. Our approach allows for mass variations and does not require tightness or decay conditions such as finite moments. The classical Benamou-Brenier formula [?] has an interpretation that the squared quadratic Wasserstein distance $W_2^2(\rho_0,\rho_1)$ between two probability measures $\rho_0$ and $\rho_1$ (with finite second moments) is the minimum of the Lagrangian action of the kinetic energy over all possible ways of transporting the original distribution $\rho_0$ of moving particles to the target one $\rho_1$ via continuity equations $\partial_t\rho + \operatorname{div}(\rho v) = 0$. In a similar spirit, our distance has two interpretations: a mechanical one through motion of charged particles (described later on in Section 3.3), and a biological one through fitness-driven dispersal of organisms described below. For both points of view the crucial role is played by the non-conservative continuity equation

\[ \partial_t\rho + \operatorname{div}(\rho\nabla u) = \rho u, \]

which allows for mass variations through the reaction term $\rho u$ appearing in the right-hand side. Biologically, $\rho(t,x)$ can be viewed as the time evolution of the spatial density of living organisms, and $u(t,x)$ as an intrinsic characteristic of population called fitness (cf. [17, ?]). The fitness manifests itself as a growth rate, and simultaneously affects the dispersal, as the species move along the gradient towards the most favorable environment. The equilibrium $u \equiv 0$ is called the ideal free distribution [26, 25], since no net movement of individuals occurs in this case. Our (squared) distance is the minimum of the Lagrangian action of the total energy, which is the sum of the kinetic energy $\rho|\nabla u|^2$ and of the potential energy $\rho|u|^2$ representing deviation from the ideal free distribution.
The advantages of our distance are that it possesses a rich underlying geometric structure and is easy to handle heuristically by the optimal transport intuition. It metrizes the narrow convergence of measures, and is lower-semicontinuous with respect to the weak-* convergence. We prove that the metric space under consideration is a complete geodesic space. The compactly supported measures are dense in this space. The Lipschitz curves (in particular, the geodesics) in this space have a clear characterization. Our distance endows the space of finite Radon measures with a formal Riemannian structure, and we are able to introduce a first- and second-order calculus à la Otto.

The fitness-driven dispersal model suggested in [?] and studied in [19] (see also [9]) turns out to be a gradient flow in our formalism. This allows us to show that the solutions to this problem exponentially converge to the ideal free distribution. The convergence itself was proven in [?] by contradiction and thus without any rate. Related fitness-driven two-animal models were investigated in [?, 36]. In forthcoming papers [31, 32] we study an ecological model for several interacting animal populations by observing that it is also a gradient flow with respect to our structure. We also refer the reader to [?] and references therein for applications of metric structures in spaces of measures (and, in particular, of the bounded-Lipschitz distance), e.g. in the context of population dynamics, pedestrian flows, and Markov chains.

For simplicity we restricted here to the quadratic cost $\rho(|\nabla u|^2 + |u|^2)$, leading to the aforementioned formal Riemannian structure in the spirit of Otto. One could otherwise choose costs of the form $\rho|\nabla u|^p$, $\rho|u|^q$ for different exponents $p, q > 1$ and construct a whole family of distances $d_{p,q}$, but those would not enjoy the same fashionable Riemannian structure (one should then rather speak of tangent cones). This is similar to the dynamic formulation of the Wasserstein distances $W_p$ of order $p$ [?], and the gradient flows with suitable driving entropies can involve the $p$-Laplacian operator as in [?]. This might be applied to identify different nonlinear reaction-diffusion and population models as gradient flows with respect to some $d_{p,q}$ distances, yet those metrics can be useful in various other applications involving quantities of variable mass.

In [20] the authors constructed a new class of pseudo-distances on the space of non-negative Radon measures via modified Benamou-Brenier formulas, and as a result some techniques are similar to the ones here. They employed conservative continuity equations but with nonlinear mobilities. On the other hand, a non-conservative modified Benamou-Brenier formula (different from ours) appears in [?]. More specifically, a class of distances $W_p$ on the space of non-negative Radon measures is constructed in [?] using perturbations of the original measures in order to obtain measures of equal mass, and consequent minimization of the sum of the classical Kantorovich-Wasserstein optimal transport cost and of the price of the perturbations, which is measured with the help of the total variation distance. These distances, exactly as ours, metrize the narrow convergence of measures. The distance $W_1$ coincides with the bounded-Lipschitz distance, and the distance $W_2$ between absolutely continuous measures can be calculated by minimizing a Lagrangian in the manner of Benamou-Brenier with the energy $|h| + \rho|v|^2$, where $\partial_t\rho + \operatorname{div}(\rho v) = h$.

The paper is organized as follows: The new distance is defined in Section 2, where we establish the aforementioned topological properties and characterize Lipschitz curves and geodesics. Section 3 is devoted to the Riemannian formalism and differential calculus à la Otto. We introduce the notion of trajectory geodesics allowing us to develop the second-order calculus via some Hamilton-Jacobi equations, and also present some explicit and illustrative computations for one-point measures. In Section 4 we exploit this Riemannian formalism to identify the fitness-driven ideal-free distribution model [?] as a gradient flow and retrieve new long-time convergence results. Finally, we opted for moving several auxiliary results, including a new entropy-entropy production inequality, to the Appendix (Section 5).


Note during final preparation. After finalization of this article we became aware of the parallel and completely independent works [16, ?], whose preprints appeared almost simultaneously with ours. In these studies, the same distance is constructed, but with different approaches and techniques, and different types of results are proved. We also refer to [?] for some extensions and applications.

In [16], the authors considered the continuity equations $\partial_t\rho + \operatorname{div}(\rho v) = \rho u$ for independent velocity fields and reaction rates, and the dynamical cost is then minimized among all couples $(u, v)$. It is not difficult to check at least formally that any minimizer $(u, v)$ must satisfy $v = \nabla u$. Thus the distance constructed therein is the same as ours, even though we restrict ourselves to potentials $(u, \nabla u)$ from the beginning.
Notation and conventions.

• Throughout the whole paper and unless otherwise specified we will always denote by

\[ \mathcal{M}^+ = \mathcal{M}^+(\mathbb{R}^d) \]

the set of nonnegative finite Radon measures in $\mathbb{R}^d$.

• We use the following notation for sets of functions:

$C_b$: bounded continuous with $\|\phi\|_\infty = \sup|\phi|$;

$C^1_b$: bounded with bounded first derivatives;

$C^\infty_c$: smooth compactly supported;

$C_0$: continuous and decaying at infinity;

$\mathrm{Lip}$: bounded and Lipschitz with $\|\phi\|_{\mathrm{Lip}} = \|\nabla\phi\|_\infty + \|\phi\|_\infty$.

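• Given a sequence $\{\rho^k\}_{k\in\mathbb{N}} \subset \mathcal{M}^+$ and $\rho \in \mathcal{M}^+$ we say that:

(i) $\rho^k$ converges narrowly to $\rho$ if there holds

\[ \forall \phi \in C_b(\mathbb{R}^d):\quad \lim_{k\to\infty}\int_{\mathbb{R}^d}\phi(x)\,d\rho^k(x) = \int_{\mathbb{R}^d}\phi(x)\,d\rho(x). \]

(ii) $\rho^k$ converges weakly-* to $\rho$ if there holds

\[ \forall \phi \in C_0(\mathbb{R}^d):\quad \lim_{k\to\infty}\int_{\mathbb{R}^d}\phi(x)\,d\rho^k(x) = \int_{\mathbb{R}^d}\phi(x)\,d\rho(x). \]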

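• Given a measure $\rho_0 \in \mathcal{M}^+$ and a continuous function $F : \mathbb{R}^d \to \mathbb{R}^d$, the measure $F_\#\rho_0$ is the pushforward of $\rho_0$ by $F$, determined by

\[ \int_{\mathbb{R}^d}\phi\,d(F_\#\rho_0) = \int_{\mathbb{R}^d}\phi\circ F\,d\rho_0 \]

for all test functions $\phi \in C_b(\mathbb{R}^d)$.

• For curves $t \in [0,1] \mapsto \rho_t \in \mathcal{M}^+$ we write $\rho \in C_w([0,1];\mathcal{M}^+)$ for the continuity with respect to the narrow topology.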

• Given a nontrivial measure $\rho \in \mathcal{M}^+$ we will denote by $H^1(d\rho)$ the Hilbert space obtained by completion of the quotient by the seminorm kernel of the space $C^1_b(\mathbb{R}^d)$ equipped with the Hilbert seminorm

\[ \|\phi\|^2_{H^1(d\rho)} = \int_{\mathbb{R}^d}\left(|\nabla\phi(x)|^2 + |\phi(x)|^2\right)d\rho(x). \]

It is not difficult to check by functional analytic tools that the Hilbert space $H^1(d\rho)$ can be identified with the set

\[ \Big\{ u = (i(u), j(u)) \;\Big|\; i(u) \in L^2(d\rho;\mathbb{R}),\ j(u) \in L^2(d\rho;\mathbb{R}^d),\ \exists\,\{\phi^k\} \subset C^1_b(\mathbb{R}^d):\ \lim_{k\to\infty}\phi^k = i(u) \text{ in } L^2(d\rho;\mathbb{R}),\ \lim_{k\to\infty}\nabla\phi^k = j(u) \text{ in } L^2(d\rho;\mathbb{R}^d) \Big\}. \]

Note that in general the elements $u \in H^1(d\rho)$ are not functions, and $j(u)$ is not the distributional gradient of $i(u)$. As a matter of fact, neither $i(u)$ nor $j(u)$ are distributions in general, but $i(u)\,d\rho$ and $j(u)\,d\rho$ are, and only the latter “products” will be considered throughout the paper. Nevertheless, for the sake of intuition and presentation, we will slightly abuse the notation and simply write $u$ instead of $i(u)$ and $\nabla u$ instead of $j(u)$. In other words, one should think of elements $u$ in $H^1(d\rho)$ as couples $(u, \nabla u)$ with $u \in L^2(d\rho;\mathbb{R})$ and $\nabla u \in L^2(d\rho;\mathbb{R}^d)$. It is worth pointing out that $\nabla u$ does not necessarily represent the derivative of $u$ in any sense unless $\rho$ is smooth enough. For example, when $\rho = \delta_0$ our space $H^1(d\delta_0)$ is isometric to $\mathbb{R}\times\mathbb{R}^d$, and the second “gradient” component $j(u) = \nabla u \in L^2(d\delta_0;\mathbb{R}^d) \cong \mathbb{R}^d$ cannot be retrieved from the sole knowledge of $i(u) = u \in L^2(d\delta_0) \cong \mathbb{R}$ by “differentiating” $u$. Alternative existing definitions of Sobolev functions with respect to measures (see, e.g., [?] and the references therein) are less suitable for our purposes. Observe that the Hilbert norm in $H^1(d\rho)$ coincides with

\[ \|u\|^2_{H^1(d\rho)} = \int_{\mathbb{R}^d}\left(|j(u)(x)|^2 + |i(u)(x)|^2\right)d\rho(x) = \int_{\mathbb{R}^d}\left(|\nabla u(x)|^2 + |u(x)|^2\right)d\rho(x). \]

Note also that if $u \in L^2(d\rho)$ is $C^1$-smooth and bounded, then $u = (u, \nabla u) \in H^1(d\rho)$, where $\nabla$ stands now for the classical gradient.

• Given a narrowly continuous curve $\rho \in C_w([0,1];\mathcal{M}^+)$ we will denote by $L^2(0,1;L^2(d\rho_t))$ the Hilbert space obtained by completion of the quotient by the seminorm kernel of the space $C^1_b((0,1)\times\mathbb{R}^d)$ equipped with the Hilbert seminorm

\[ \|\phi\|^2_{L^2(0,1;L^2(d\rho_t))} = \int_0^1\left(\int_{\mathbb{R}^d}|\phi(t,x)|^2\,d\rho_t(x)\right)dt. \]

By construction, this space is isometric to the space $L^2((0,1)\times\mathbb{R}^d, d\rho)$, where the measure $d\rho = dt\otimes d\rho_t \in \mathcal{M}^+((0,1)\times\mathbb{R}^d)$ is defined by disintegration as

\[ \forall \phi \in C_b((0,1)\times\mathbb{R}^d):\quad \iint_{(0,1)\times\mathbb{R}^d}\phi(t,x)\,d\rho(t,x) := \int_0^1\left(\int_{\mathbb{R}^d}\phi(t,x)\,d\rho_t(x)\right)dt. \]

Take any $u \in L^2(0,1;L^2(d\rho_t))$ and a sequence $\{\phi^k\} \subset C^1_b((0,1)\times\mathbb{R}^d)$ converging to $u$ in $L^2((0,1)\times\mathbb{R}^d, d\rho)$. Passing to a subsequence if necessary, we may assume that for a.a. $t \in (0,1)$ there exists the $L^2(d\rho_t)$-limit of $\phi^k(t,\cdot)$. The limit $u_t := \lim_{k\to\infty}\phi^k(t,\cdot) \in L^2(d\rho_t)$ is well-defined (does not depend on $\phi^k$) for a.a. $t$, and

\[ \|u\|^2_{L^2(0,1;L^2(d\rho_t))} = \int_0^1\|u_t\|^2_{L^2(d\rho_t)}\,dt. \]

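• Given a narrowly continuous curve $\rho \in C_w([0,1];\mathcal{M}^+)$ we will denote by $L^2(0,1;H^1(d\rho_t))$ the Hilbert space obtained by completion of the quotient by the seminorm kernel of the space $C^1_b((0,1)\times\mathbb{R}^d)$ equipped with the Hilbert seminorm

\[ \|\phi\|^2_{L^2(0,1;H^1(d\rho_t))} = \int_0^1\left(\int_{\mathbb{R}^d}\left(|\nabla\phi(t,x)|^2 + |\phi(t,x)|^2\right)d\rho_t(x)\right)dt. \]

As above, one can prove that $L^2(0,1;H^1(d\rho_t))$ can be identified with the set

\[ \Big\{ u = (i(u), j(u)) \;\Big|\; i(u) \in L^2(0,1;L^2(d\rho_t)),\ j(u) \in L^2(0,1;L^2(d\rho_t;\mathbb{R}^d)),\ \exists\,\{\phi^k\} \subset C^1_b((0,1)\times\mathbb{R}^d):\ \lim_{k\to\infty}\phi^k = i(u) \text{ in } L^2(0,1;L^2(d\rho_t)),\ \lim_{k\to\infty}\nabla\phi^k = j(u) \text{ in } L^2(0,1;L^2(d\rho_t;\mathbb{R}^d)) \Big\}. \]

One can see that if $u \in L^2(0,1;H^1(d\rho_t))$ then $u_t := (i(u)_t, j(u)_t) \in H^1(d\rho_t)$ is well-defined for a.e. $t \in (0,1)$, and

\[ \|u\|^2_{L^2(0,1;H^1(d\rho_t))} = \int_0^1\|u_t\|^2_{H^1(d\rho_t)}\,dt. \]

In the same spirit as above, we will abuse the notation and write $u$ instead of $i(u)$ and $\nabla u$ instead of $j(u)$, for any $u \in L^2(0,1;H^1(d\rho_t))$. Then the Hilbert norm in $L^2(0,1;H^1(d\rho_t))$ is

\[ \|u\|^2_{L^2(0,1;H^1(d\rho_t))} = \int_0^1\left(\int_{\mathbb{R}^d}\left(|\nabla u_t(x)|^2 + |u_t(x)|^2\right)d\rho_t(x)\right)dt. \]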

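• Given a narrowly continuous curve $\rho \in C_w([0,1];\mathcal{M}^+)$ and $u \in L^2(0,1;H^1(d\rho_t))$, we say that

\[ \partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_t u_t \tag{1.1} \]

is satisfied in the distributional sense $\mathcal{D}'((0,1)\times\mathbb{R}^d)$ if

\[ -\int_0^1\int_{\mathbb{R}^d}\partial_t\phi\,d\rho_t(x)\,dt = \int_0^1\int_{\mathbb{R}^d}\left(\nabla\phi\cdot\nabla u_t + \phi u_t\right)d\rho_t(x)\,dt \]

for all $\phi \in C^\infty_c((0,1)\times\mathbb{R}^d)$. We will often refer to (1.1) as the non-conservative continuity equation. For any such $\rho \in C_w([0,1];\mathcal{M}^+)$ an easy approximation argument shows that in fact

\[ \forall \phi \in C^1_b(\mathbb{R}^d):\quad \int_{\mathbb{R}^d}\phi\,(d\rho_t - d\rho_s) = \int_s^t\int_{\mathbb{R}^d}\left(\nabla\phi\cdot\nabla u_r + \phi u_r\right)d\rho_r\,dr \tag{1.2} \]

holds for all fixed $0 \le s \le t \le 1$, which is equivalent to the previous weak formulation if $t \mapsto \rho_t$ is narrowly continuous.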

• The bounded-Lipschitz distance between two measures $\rho_0, \rho_1 \in \mathcal{M}^+$ is

\[ d_{BL}(\rho_0,\rho_1) = \sup_{\|\phi\|_{\mathrm{Lip}}\le 1}\left|\int_{\mathbb{R}^d}\phi\,(d\rho_1 - d\rho_0)\right|. \]

Useful properties are that the metric space $(\mathcal{M}^+, d_{BL})$ is complete and that $d_{BL}$ metrizes the narrow convergence on $\mathcal{M}^+$. These facts are well-known [?] if we replace $\mathcal{M}^+$ by the space of probability measures. The reduction of the general case to measures of unit mass is not involved. We remark that these properties may also be indirectly deduced from the results of [40] and [?]. Another useful observation is that the supremum can be restricted to smooth compactly supported functions. This follows from the tightness of a set consisting of two finite Radon measures.

• Finally, $B_R$ are open balls of radius $R > 0$ in $\mathbb{R}^d$ centered at zero, and $C$ is a generic positive constant.
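
The difference between narrow and weak-* convergence introduced in the list above can be seen on a unit mass escaping to infinity. The following is a minimal Python sketch in dimension $d = 1$; the helper names and the particular test functions are illustrative assumptions, not taken from the paper.

```python
def integrate_dirac(phi, point, mass=1.0):
    """Integral of phi against the measure mass * delta_point on the real line."""
    return mass * phi(point)

def phi_decaying(x):
    """A test function in C_0: continuous and decaying at infinity."""
    return 1.0 / (1.0 + x * x)

def phi_constant(x):
    """A test function in C_b but not in C_0: it measures the total mass."""
    return 1.0

# rho^k = delta_k is a unit mass escaping to infinity as k grows.
ks = [1, 10, 100, 1000]
weak_star_pairings = [integrate_dirac(phi_decaying, float(k)) for k in ks]
narrow_pairings = [integrate_dirac(phi_constant, float(k)) for k in ks]

print(weak_star_pairings)  # decreases towards 0
print(narrow_pairings)     # stays equal to 1
```

Pairings with the decaying test function tend to zero, consistent with weak-* convergence of $\delta_k$ to the zero measure, while the pairing with the constant function keeps the value 1, so the sequence does not converge narrowly to zero.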

2. The metric space $(\mathcal{M}^+, d)$

2.1. Definition and first properties. 

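Definition 2.1. Given two finite Radon measures $\rho_0, \rho_1 \in \mathcal{M}^+$ we define

\[ d^2(\rho_0,\rho_1) := \inf_{(\rho,u)\in\mathcal{A}(\rho_0,\rho_1)}\int_0^1\left(\int_{\mathbb{R}^d}\left(|\nabla u_t|^2 + |u_t|^2\right)d\rho_t\right)dt, \tag{2.1} \]

where the admissible set $\mathcal{A}(\rho_0,\rho_1)$ consists of all couples $(\rho_t, u_t)_{t\in[0,1]}$ such that

\[ \begin{cases} \rho \in C_w([0,1];\mathcal{M}^+), \\ \rho|_{t=0} = \rho_0;\quad \rho|_{t=1} = \rho_1, \\ u \in L^2(0,1;H^1(d\rho_t)), \\ \partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_t u_t \text{ in the distributional sense.} \end{cases} \]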

As already anticipated, we shall prove shortly that

Theorem 1. $d$ is a distance on $\mathcal{M}^+(\mathbb{R}^d)$.


Before going into the proof we need a preliminary result, to be used repeatedly in the sequel:


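Lemma 2.2 (Bounded-Lipschitz and mass estimate). Let $\rho \in C_w([0,1];\mathcal{M}^+)$ be a narrowly continuous curve, assume that the non-conservative continuity equation $\partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_t u_t$ is satisfied in the distributional sense for some potential $u \in L^2(0,1;H^1(d\rho_t))$ with finite energy

\[ E = E[\rho;u] = \|u\|^2_{L^2(0,1;H^1(d\rho_t))} = \int_0^1\int_{\mathbb{R}^d}\left(|\nabla u_t|^2 + |u_t|^2\right)d\rho_t(x)\,dt, \]

and let $M := 2(\max\{m_0,m_1\} + E)$ with $m_i = d\rho_i(\mathbb{R}^d)$. Then the masses are bounded uniformly in time, $m_t = d\rho_t(\mathbb{R}^d) \le M$, and

\[ \left|\int_{\mathbb{R}^d}\phi\,(d\rho_t - d\rho_s)\right| \le \left(\|\nabla\phi\|_\infty + \|\phi\|_\infty\right)\sqrt{ME}\,|t-s|^{1/2} \tag{2.2} \]

for all $\phi \in C^1_b(\mathbb{R}^d)$ and all $0 \le s \le t \le 1$.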

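Proof. By narrow continuity of $t \mapsto \rho_t$ the masses $m_t$ are uniformly bounded and $m = \max_{t\in[0,1]} m_t$ is finite. Using the Cauchy-Schwarz inequality in (1.2) gives

\[ \begin{aligned} \left|\int_{\mathbb{R}^d}\phi\,(d\rho_t - d\rho_s)\right| &= \left|\int_s^t\int_{\mathbb{R}^d}\left(\nabla\phi\cdot\nabla u_r + \phi u_r\right)d\rho_r\,dr\right| \\ &\le \left(\int_s^t\int_{\mathbb{R}^d}\left(|\nabla\phi|^2 + |\phi|^2\right)d\rho_r\,dr\right)^{1/2}\left(\int_s^t\int_{\mathbb{R}^d}\left(|\nabla u_r|^2 + |u_r|^2\right)d\rho_r\,dr\right)^{1/2} \\ &\le \left(\|\nabla\phi\|_\infty + \|\phi\|_\infty\right)\sqrt{m\,|t-s|}\left(\int_s^t\int_{\mathbb{R}^d}\left(|\nabla u_r|^2 + |u_r|^2\right)d\rho_r\,dr\right)^{1/2} \\ &\le \left(\|\nabla\phi\|_\infty + \|\phi\|_\infty\right)\sqrt{mE}\,|t-s|^{1/2}, \end{aligned} \]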
and it is enough to estimate $m \le M = 2(\max\{m_0,m_1\} + E)$ as in our statement. Choosing $\phi \equiv 1$ we obtain from the previous estimate $|m_t - m_s| \le \sqrt{mE}\,|t-s|^{1/2}$. Let $t_0 \in [0,1]$ be any time when $m_{t_0} = m$: choosing $t = t_0$ and $s = 0$ we immediately get $m \le m_0 + \sqrt{mE}\,|t_0 - 0|^{1/2} \le m_0 + \sqrt{mE}$, and some elementary algebra bounds $m \le 2(m_0 + E)$. Exchanging the roles of $\rho_0, \rho_1$ we get similarly $m \le 2(m_1 + E)$, and finally $m \le M$. □
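
The "elementary algebra" step at the end of the proof can be checked numerically. The sketch below is an illustration only: it solves the quadratic inequality $m \le m_0 + \sqrt{mE}$ in the variable $x = \sqrt{m}$ and compares the largest admissible $m$ with the claimed bound $2(m_0 + E)$ on randomly sampled values; the function name and the sampling range are assumptions made for the example.

```python
import math
import random

def largest_admissible_mass(m0, E):
    """Largest m >= 0 with m <= m0 + sqrt(m * E), via the quadratic in x = sqrt(m)."""
    x = (math.sqrt(E) + math.sqrt(E + 4.0 * m0)) / 2.0
    return x * x

random.seed(0)
samples = [(random.uniform(0.0, 10.0), random.uniform(0.0, 10.0)) for _ in range(1000)]

# Ratio between the extremal mass and the claimed bound 2 * (m0 + E).
worst_ratio = max(
    largest_admissible_mass(m0, E) / (2.0 * (m0 + E))
    for m0, E in samples
    if m0 + E > 0.0
)
print(worst_ratio)
```

The ratio never exceeds 1 because $\big(\sqrt{E} + \sqrt{E + 4m_0}\big)^2/4 \le E + 2m_0 \le 2(m_0 + E)$.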

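Remark 2.3. The paths and energies can easily be scaled in time as follows: if $(\rho_t, u_t)_{t\in[0,1]}$ connects $\rho_0$ to $\rho_1$ with energy $E[\rho;u]$, then for any $T > 0$ the path $(\bar\rho_s, \bar u_s) := \left(\rho_{s/T}, \frac{1}{T}u_{s/T}\right)$ connects $\rho_0$ to $\rho_1$ in time $s \in [0,T]$ with energy

\[ E_T[\bar\rho;\bar u] = \int_0^T\left(\int_{\mathbb{R}^d}\left(|\nabla\bar u_s|^2 + |\bar u_s|^2\right)d\bar\rho_s\right)ds = \frac{1}{T}\int_0^1\left(\int_{\mathbb{R}^d}\left(|\nabla u_t|^2 + |u_t|^2\right)d\rho_t\right)dt = \frac{1}{T}E[\rho;u]. \]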
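
The $1/T$ scaling of the energy can be illustrated on a pure-mass path $\rho_t = \lambda(t)\rho_0$ with space-constant potential $u_t = \lambda'(t)/\lambda(t)$ (paths of this type are used in the proof of Proposition 2.6), for which the energy reduces to $m_0\int |\lambda'|^2/\lambda$. The following Python sketch is a hedged illustration: the profile $\lambda(t) = (1 - t/2)^2$, the quadrature rule and all names are choices made for the example.

```python
def energy(lam, dlam, horizon, m0=1.0, n=20000):
    """Midpoint-rule approximation of m0 * int_0^horizon (lam')^2 / lam ds."""
    h = horizon / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += dlam(s) ** 2 / lam(s) * h
    return m0 * total

def lam(t):
    """Mass profile on [0, 1], decreasing from 1 to 1/4."""
    return (1.0 - 0.5 * t) ** 2

def dlam(t):
    """Derivative of lam."""
    return -(1.0 - 0.5 * t)

def rescaled(T):
    """Profile s -> lam(s / T) on [0, T] and its derivative (chain rule)."""
    return (lambda s: lam(s / T)), (lambda s: dlam(s / T) / T)

E1 = energy(lam, dlam, 1.0)
T = 4.0
lam_T, dlam_T = rescaled(T)
ET = energy(lam_T, dlam_T, T)
print(E1, ET, E1 / T)
```

The last two printed values agree up to quadrature error, in line with $E_T = E/T$.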


Proof of Theorem 1. Let us first show that $d(\rho_0,\rho_1)$ is always finite for any $\rho_0, \rho_1 \in \mathcal{M}^+$. Indeed for any finite measure $\nu_0 \in \mathcal{M}^+$ it is easy to see that $\nu_t = (1-t)^2\nu_0$ and $u_t = (u_t, \nabla u_t) = \left(-\frac{2}{1-t}, 0\right)$ give a narrowly continuous curve $t \mapsto \nu_t \in \mathcal{M}^+$ connecting $\nu_0$ to zero, and an easy computation shows that this path has finite energy $E[\nu;u] = 4\,d\nu_0(\mathbb{R}^d) < \infty$ (this curve is actually the geodesic between $\nu_0$ and $0$, see later on the proof of Proposition 2.6 for the details). Rescaling time as in Remark 2.3 it is then easy to connect any two measures $\rho_0, \rho_1 \in \mathcal{M}^+$ in time $t \in [0,1]$ by first connecting $\rho_0$ to $0$ in time $t \in [0,1/2]$ and then connecting $0$ to $\rho_1$ in time $t \in [1/2,1]$ with cost exactly $E = 2\left(4\,d\rho_0(\mathbb{R}^d) + 4\,d\rho_1(\mathbb{R}^d)\right) < \infty$.

In order to show that $d$ is really a distance, observe first that the symmetry $d(\rho_0,\rho_1) = d(\rho_1,\rho_0)$ is obvious by definition.

For the indiscernibility, assume that $\rho_0, \rho_1 \in \mathcal{M}^+$ are such that $d(\rho_0,\rho_1) = 0$. Let $(\rho^k_t, u^k_t)_{t\in[0,1]}$ be any minimizing sequence in (2.1), i.e. $\lim_{k\to\infty}E[\rho^k;u^k] = d^2(\rho_0,\rho_1) = 0$. By Lemma 2.2 we see that the masses $m^k_t = d\rho^k_t(\mathbb{R}^d)$ are uniformly bounded, $\sup_{t\in[0,1],\,k\in\mathbb{N}} m^k_t \le M$. For any fixed $\phi \in C^\infty_c(\mathbb{R}^d)$ the fundamental estimate (2.2) gives

\[ \left|\int_{\mathbb{R}^d}\phi\,(d\rho_1 - d\rho_0)\right| \le \left(\|\nabla\phi\|_\infty + \|\phi\|_\infty\right)\sqrt{M\,E[\rho^k;u^k]}. \]

Since $\lim_{k\to\infty}E[\rho^k;u^k] = d^2(\rho_0,\rho_1) = 0$ we conclude that $\int_{\mathbb{R}^d}\phi\,(d\rho_1 - d\rho_0) = 0$ for all $\phi \in C^\infty_c(\mathbb{R}^d)$, thus $\rho_1 = \rho_0$ as desired.

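As for the triangular inequality, fix any $\rho_0, \rho_1, \nu \in \mathcal{M}^+$ and let us prove that $d(\rho_0,\rho_1) \le d(\rho_0,\nu) + d(\nu,\rho_1)$. We can assume that all three distances are nonzero, otherwise the triangular inequality trivially holds by the previous point. Let now $(\underline\rho^k_t, \underline u^k_t)_{t\in[0,1]}$ be a minimizing sequence in the definition of $d^2(\rho_0,\nu) = \lim_{k\to\infty}E[\underline\rho^k;\underline u^k]$, and let similarly $(\overline\rho^k_t, \overline u^k_t)_{t\in[0,1]}$ be such that $d^2(\nu,\rho_1) = \lim_{k\to\infty}E[\overline\rho^k;\overline u^k]$. For fixed $\tau \in (0,1)$ let $(\rho^k_t, u^k_t)$ be the continuous path obtained by first following $(\underline\rho^k, \underline u^k)$ from $\rho_0$ to $\nu$ in time $\tau$, and then following $(\overline\rho^k, \overline u^k)$ from $\nu$ to $\rho_1$ in time $1-\tau$. Then $(\rho^k_t, u^k_t)_{t\in[0,1]}$ is an admissible path connecting $\rho_0$ to $\rho_1$, hence by definition of our distance and the explicit time scaling in Remark 2.3 we get that

\[ d^2(\rho_0,\rho_1) \le E[\rho^k;u^k] = \frac{1}{\tau}E[\underline\rho^k;\underline u^k] + \frac{1}{1-\tau}E[\overline\rho^k;\overline u^k]. \]

Letting $k \to \infty$ we obtain for any fixed $\tau \in (0,1)$

\[ d^2(\rho_0,\rho_1) \le \frac{1}{\tau}d^2(\rho_0,\nu) + \frac{1}{1-\tau}d^2(\nu,\rho_1). \]

Finally choosing $\tau = \frac{d(\rho_0,\nu)}{d(\rho_0,\nu) + d(\nu,\rho_1)} \in (0,1)$ yields

\[ d^2(\rho_0,\rho_1) \le \frac{1}{\tau}d^2(\rho_0,\nu) + \frac{1}{1-\tau}d^2(\nu,\rho_1) = \left(d(\rho_0,\nu) + d(\nu,\rho_1)\right)^2 \]

and the proof is complete. □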
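
The choice of the gluing time $\tau$ in the last step can be checked on numbers. The Python sketch below is illustrative only: `a` and `b` stand for hypothetical values of $d(\rho_0,\nu)$ and $d(\nu,\rho_1)$, and the script compares the bound $a^2/\tau + b^2/(1-\tau)$ at $\tau = a/(a+b)$ with its values on a grid.

```python
def glued_cost(tau, a, b):
    """Upper bound a^2 / tau + b^2 / (1 - tau) obtained by gluing two paths at time tau."""
    return a * a / tau + b * b / (1.0 - tau)

a, b = 1.5, 0.7  # hypothetical values of d(rho_0, nu) and d(nu, rho_1)
tau_star = a / (a + b)

# Brute-force comparison on a uniform grid of gluing times in (0, 1).
grid = [i / 1000.0 for i in range(1, 1000)]
best_on_grid = min(glued_cost(tau, a, b) for tau in grid)

print(glued_cost(tau_star, a, b), (a + b) ** 2, best_on_grid)
```

At $\tau = a/(a+b)$ the bound equals $a(a+b) + b(a+b) = (a+b)^2$, and no grid value goes below it.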

As an immediate consequence of Lemma 2.2 we get

Corollary 2.4. The elements of a bounded set in $(\mathcal{M}^+, d)$ have uniformly bounded mass.

The converse statement is also true, see Corollary 2.7 below. Another property, easily following from Remark 2.3, is

Lemma 2.5. If $(\rho_t, u_t)_{t\in[0,1]}$ is a narrowly continuous curve with total energy $E$ then $t \mapsto \rho_t$ is $1/2$-Hölder continuous w.r.t. $d$, and more precisely

\[ \forall t_0, t_1 \in [0,1]:\quad d(\rho_{t_0},\rho_{t_1}) \le \sqrt{E}\,|t_0 - t_1|^{1/2}. \]

Proof. Rescaling in time and connecting $\rho_{t_0}$ to $\rho_{t_1}$ by the path $(\rho_t, (t_1 - t_0)u_t)_{s\in[0,1]}$ with $t = t_0 + (t_1 - t_0)s$, the resulting energy $\tilde E$ scales as $d^2(\rho_{t_0},\rho_{t_1}) \le \tilde E \le E\,|t_0 - t_1|$. □
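
As a sanity check of the Hölder estimate, one can take the explicit path $\rho_t = (1-t)^2\rho_0$ of energy $E = 4m_0$ from the proof of Theorem 1 and evaluate the distance between two of its points with the formula for proportional measures stated after Proposition 2.8, $d = 2|\sqrt{m_s} - \sqrt{m_t}|$. The Python sketch below only illustrates the inequality on this one example; the mass `m0` and the grid are arbitrary choices.

```python
import math

m0 = 3.0        # hypothetical total mass of rho_0
E = 4.0 * m0    # energy of the path rho_t = (1 - t)^2 rho_0

def dist_along_path(s, t):
    """d(rho_s, rho_t) for the proportional measures rho_s, rho_t, using (2.3)."""
    mass_s = (1.0 - s) ** 2 * m0
    mass_t = (1.0 - t) ** 2 * m0
    return 2.0 * abs(math.sqrt(mass_s) - math.sqrt(mass_t))

# Collect all pairs of grid times violating d <= sqrt(E) * |s - t|^(1/2).
grid = [i / 50.0 for i in range(51)]
violations = [
    (s, t)
    for s in grid
    for t in grid
    if dist_along_path(s, t) > math.sqrt(E * abs(s - t)) + 1e-12
]
print(violations)
```

Here $d(\rho_s,\rho_t) = 2\sqrt{m_0}\,|t-s|$, which is dominated by $2\sqrt{m_0}\,|t-s|^{1/2}$ on $[0,1]$, so the list of violations is empty.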
Before proceeding with the study of the topological properties of our metric we need some more results.

Proposition 2.6. For any $\rho_0 \in \mathcal{M}^+$ there holds $d(\rho_0, 0) = 2\sqrt{m_0}$ with $m_0 = d\rho_0(\mathbb{R}^d)$, and the geodesic is explicitly given by $\rho_t = (1-t)^2\rho_0$ and $u_t = (u_t, \nabla u_t) = \left(-\frac{2}{1-t}, 0\right)$.

Note in particular that $\nabla u_t \equiv 0$, which means that the optimal strategy to send a measure $\rho_0$ to zero is always to “squeeze it down” without any transport.

Proof. We start by showing that in the minimization problem in the definition of $d^2(\rho_0, 0)$ one can restrict to paths of the form $\rho_t = \lambda(t)\rho_0$, and then we compute the optimal $\lambda(t)$ from which we recover $(\rho_t, u_t)$ with $u_t$ constant in space, i.e. $\nabla u_t \equiv 0$.

Step 1: no transport is involved. Let $(\rho_t, u_t)_{t\in[0,1]}$ be any admissible path connecting $\rho_0$ to $\rho_1 = 0$ with finite energy. We claim that

\[ \tilde\rho_t := \frac{m_t}{m_0}\rho_0, \qquad \tilde u_t := \langle u_t\rangle_{\rho_t} = \frac{1}{m_t}\int_{\mathbb{R}^d}u_t\,d\rho_t, \qquad \nabla\tilde u_t \equiv 0 \]

always gives an admissible path (i.e. $\tilde u = (\tilde u_t, \nabla\tilde u_t) \in L^2(0,1;H^1(d\tilde\rho_t))$) with lesser energy, where $m_t = d\rho_t(\mathbb{R}^d)$ denotes the mass at time $t$ as before. Note that because the initial path $\rho_t$ is narrowly continuous we have in particular that $t \mapsto m_t$ is continuous, and we can thus assume that $m_t > 0$ in $[0,1)$ (otherwise $\rho_{t_0} = 0$ is attained for some time $t_0 \in (0,1)$, so whatever happens after $t_0$ can only cost an unnecessary extra energy, and scaling in time $t = st_0$ with $s \in (0,1)$ decreases the total cost). This continuity also shows that $\tilde\rho_t$ connects $\rho_0$ to $0$, since $m_t \to 0$ implies narrow convergence $\tilde\rho_t \to 0$ when $t \to 1$. Since $\tilde u_t = \langle u_t\rangle_{\rho_t}$ is constant in space and $\nabla\tilde u_t \equiv 0$ we have by Jensen's inequality

\[ E[\tilde\rho;\tilde u] = \int_0^1\left(0 + \langle u_t\rangle^2_{\rho_t}\right)m_t\,dt \le \int_0^1\left(\int_{\mathbb{R}^d}|u_t|^2\,d\rho_t\right)dt \le E[\rho;u], \]

which shows that $(\tilde\rho_t, \tilde u_t)$ has lesser energy as claimed.

It remains to show that this path solves the non-conservative continuity equation. Taking $\phi(x) \equiv 1$ in the weak formulation of the non-conservative continuity equation satisfied by $(\rho_t, u_t)$ and exploiting $\sup_t m_t \le M$ it is easy to check that $\frac{d}{dt}m_t = \int_{\mathbb{R}^d}u_t\,d\rho_t \in L^2(0,1)$. Then in the sense of distributions $\mathcal{D}'((0,1)\times\mathbb{R}^d)$ we have

\[ \partial_t\tilde\rho_t = \frac{1}{m_0}\frac{dm_t}{dt}\,\rho_0 = \frac{m_t}{m_0}\rho_0\cdot\frac{1}{m_t}\int_{\mathbb{R}^d}u_t\,d\rho_t = \tilde\rho_t\tilde u_t, \]

and we conclude recalling that by construction $\nabla\tilde u_t \equiv 0$ in the advection term.

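Step 2: computing the geodesic. By the previous step it is enough to minimize over all $(\rho_t, u_t)$ such that $\rho_t = \lambda(t)\rho_0$ with $\lambda(t) > 0$ in $[0,1)$, and constant-in-space potentials $\nabla u_t \equiv 0$. For such paths it is easy to realize that necessarily $u_t = \frac{\lambda'(t)}{\lambda(t)}$, thus we only have to solve the minimization problem

\[ \min\left\{ m_0\int_0^1\left|\frac{\lambda'(t)}{\lambda(t)}\right|^2\lambda(t)\,dt \;:\; \lambda(0) = 1,\ \lambda(1) = 0 \right\}, \]

where $m_0 = d\rho_0(\mathbb{R}^d)$. It is a simple exercise in the calculus of variations to check that the unique minimizer is

\[ \lambda(t) = (1-t)^2, \qquad u(t) = \frac{\lambda'(t)}{\lambda(t)} = -\frac{2}{1-t}, \qquad \rho_t = \lambda(t)\rho_0 = (1-t)^2\rho_0, \]

and the explicit computation finally gives

\[ d^2(\rho_0, 0) = m_0\int_0^1|u(t)|^2\lambda(t)\,dt = m_0\int_0^1\left|\frac{-2}{1-t}\right|^2(1-t)^2\,dt = 4m_0 \]

as claimed. □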
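
The minimization in Step 2 can be illustrated numerically by comparing the quadratic profile with other mass profiles joining $\lambda(0) = 1$ to $\lambda(1) = 0$. The sketch below restricts, as an assumption made for the example, to the one-parameter family $\lambda_a(t) = (1-t)^{2a}$, whose energy has the closed form $4a^2 m_0/(2a-1)$ for $a > 1/2$; the quadrature rule and names are illustrative.

```python
def energy_power_profile(a, m0=1.0, n=2000):
    """Midpoint-rule energy m0 * int_0^1 (lam')^2 / lam dt for lam(t) = (1 - t)^(2a)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        lam = (1.0 - t) ** (2.0 * a)
        dlam = -2.0 * a * (1.0 - t) ** (2.0 * a - 1.0)
        total += dlam * dlam / lam * h
    return m0 * total

# a = 1 is the profile (1 - t)^2 of Proposition 2.6; the other exponents are competitors.
energies = {a: energy_power_profile(a) for a in (1.0, 1.5, 2.0)}
print(energies)
```

Within this family the smallest energy is attained at $a = 1$, with value $4m_0$. One way to see why: with $\mu = \sqrt{\lambda}$ the energy becomes $4m_0\int_0^1|\mu'|^2\,dt$, which is minimized by the affine profile $\mu(t) = 1 - t$.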


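As an easy consequence of the previous Proposition 2.6 we obtain

Corollary 2.7. Subsets of $\mathcal{M}^+$ with uniformly bounded mass are bounded in $(\mathcal{M}^+, d)$, as $d\rho(\mathbb{R}^d) \le M \Rightarrow d(\rho, 0) \le 2\sqrt{M}$.

Proposition 2.8 (Scaling properties). For any $\lambda \in \mathbb{R}^+$ and measures $\rho_0, \rho_1 \in \mathcal{M}^+$ we have

\[ d(\rho_0, \lambda\rho_0) = 2\sqrt{m_0}\,\left|1 - \sqrt{\lambda}\right| \tag{2.3} \]

and

\[ d(\lambda\rho_0, \lambda\rho_1) = \sqrt{\lambda}\,d(\rho_0,\rho_1). \tag{2.4} \]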
Remark that (2.3) can be rephrased more intrinsically as: “if $\rho_0, \rho_1 \in \mathcal{M}^+$ are proportional measures then $d(\rho_0,\rho_1) = 2\left|\sqrt{m_0} - \sqrt{m_1}\right|$”, and also that the previous Proposition 2.6 is a particular case with $\lambda = 0$.

Proof. For the first part it is easy to argue as in the proof of Proposition 2.6 to see that it is enough to minimize over all paths $\rho_t = \lambda(t)\rho_0$, with $u_t = \frac{\lambda'(t)}{\lambda(t)}$, $\nabla u_t \equiv 0$ (i.e. averaging in space decreases the energy) and the constraints $\lambda(0) = 1$, $\lambda(1) = \lambda$. Finding the optimal $\lambda(t)$ is again a simple exercise in the calculus of variations, and the explicit computation leads to (2.3) (the minimizer $\lambda(t)$ is again a second order polynomial).

For (2.4), note that our statement is trivial if $\lambda = 0$ so we can assume that $\lambda > 0$. We denote below $\tilde\rho_0 = \lambda\rho_0$, $\tilde\rho_1 = \lambda\rho_1$. Let $(\rho^k_t, u^k_t)_{t\in[0,1]}$ be a minimizing sequence in the definition of $d^2(\rho_0,\rho_1)$. Because both the non-conservative continuity equation and the energy are linear in $\rho$ we see that $(\lambda\rho^k, u^k)$ is an admissible path connecting $\tilde\rho_0$ to $\tilde\rho_1$, and

\[ d^2(\tilde\rho_0,\tilde\rho_1) \le E[\lambda\rho^k;u^k] = \lambda E[\rho^k;u^k] \xrightarrow[k\to\infty]{} \lambda\,d^2(\rho_0,\rho_1). \]

The other inequality is obtained similarly: if $(\tilde\rho^k, \tilde u^k)$ is a minimizing sequence for $d^2(\tilde\rho_0,\tilde\rho_1)$ then $\left(\frac{1}{\lambda}\tilde\rho^k, \tilde u^k\right)$ connects $\rho_0$ to $\rho_1$, thus

\[ d^2(\rho_0,\rho_1) \le E[\tilde\rho^k/\lambda;\tilde u^k] = \frac{1}{\lambda}E[\tilde\rho^k;\tilde u^k] \xrightarrow[k\to\infty]{} \frac{1}{\lambda}\,d^2(\tilde\rho_0,\tilde\rho_1) \]

and the proof is complete. □
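
The two scaling properties are consistent with each other on proportional measures, which gives a quick numerical cross-check. In the Python sketch below the masses and the scaling factor are arbitrary illustrative values, and `d_proportional` simply encodes the rephrased form of (2.3).

```python
import math

def d_proportional(m_a, m_b):
    """Distance between two proportional measures with masses m_a and m_b, from (2.3)."""
    return 2.0 * abs(math.sqrt(m_a) - math.sqrt(m_b))

m0, m1, lam = 2.0, 5.0, 3.0   # hypothetical masses and scaling factor

# Homogeneity (2.4) restricted to proportional measures.
lhs = d_proportional(lam * m0, lam * m1)
rhs = math.sqrt(lam) * d_proportional(m0, m1)
print(lhs, rhs)

# The case of a vanishing target mass recovers Proposition 2.6.
print(d_proportional(m0, 0.0), 2.0 * math.sqrt(m0))
```

Both printed pairs agree up to rounding. This only tests the formulas against each other on proportional measures and is not a substitute for the proof.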

2.2. Topological properties. 

Theorem 2. The compactly supported measures are dense in $(\mathcal{M}^+, d)$: for any $\rho \in \mathcal{M}^+$ and $\varepsilon > 0$ there exists $\rho' \in \mathcal{M}^+$ compactly supported such that $d(\rho,\rho') \le \varepsilon$.

Proof. Observe that $\rho$ has arbitrarily small mass outside of $B_R$ for large $R$. The argument goes in two steps: first we create an annular gap around $|x| = R$ with arbitrarily small cost, i.e. construct a measure $\tilde\rho$ which has support in $B_R \cup (\mathbb{R}^d\setminus B_{R+\delta})$ for some small $\delta > 0$ such that $d(\rho,\tilde\rho) \le \varepsilon/2$ and the mass of $\tilde\rho$ outside of $B_{R+\delta}$ is still small. The second step consists in sending all the exterior mass $d\tilde\rho(\mathbb{R}^d\setminus B_{R+\delta})$ to zero while keeping the interior part $\tilde\rho|_{B_R}$ unchanged (the fact that we can do both simultaneously relies on the gap of size $\delta > 0$). In fact we will do so without modifying the original measure $\rho$ inside $B_R$, so that really in the end we will take $\rho' = \rho|_{B_R}$ with $d(\rho,\rho') \le \varepsilon$.
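Step 1: creating the gap. For fixed $R > 0$ let us first decompose

\[ \rho = \hat\rho + \check\rho \quad\text{with}\quad \hat\rho := \rho|_{B_R} \quad\text{and}\quad \check\rho := \rho|_{\mathbb{R}^d\setminus B_R}. \]

Let also $U(x) = U_R(x)$ be any smooth function such that

\[ U(x) = \begin{cases} |x| - R & \text{if } R \le |x| \le R+1, \\ 2 & \text{if } |x| \ge R+2, \end{cases} \qquad\text{and}\qquad |U|, |\nabla U| \le 2. \]

By [38, Prop. 3.6] we can solve

\[ \partial_t\check\rho_t + \operatorname{div}(\check\rho_t\nabla U) = \check\rho_t U, \qquad \check\rho|_{t=0} = \check\rho, \]

and $t \mapsto \check\rho_t$ is narrowly continuous. Moreover by construction the initial datum $\check\rho$ has support in $\mathbb{R}^d\setminus B_R$, so for $t \in [0,1]$ it is easy to see that the measure $\check\rho_t$ has support in $\mathbb{R}^d\setminus B_{R+t}$ (the characteristics $\frac{dx}{dt} = \nabla U(x)$ diverge radially away from $B_R$, with constant speed $|\nabla U| = 1$ for $R \le |x| \le R+1$). Denoting the mass

\[ m_t := \int_{\mathbb{R}^d}d\check\rho_t = \int_{\mathbb{R}^d\setminus B_R}d\check\rho_t, \]

it is easy to check from the non-conservative continuity equation that

\[ \frac{d}{dt}m_t = \int_{\mathbb{R}^d}U\,d\check\rho_t \le \int_{\mathbb{R}^d}|U|\,d\check\rho_t \le 2m_t, \]

so that

\[ m_t \le e^{2t}m_0 \]

for $t \in [0,1]$. Define now

\[ \rho_t := \hat\rho + \check\rho_t \qquad\text{and}\qquad u_t(x) = \begin{cases} 0 & \text{if } |x| \le R, \\ U(x) & \text{if } |x| > R \end{cases} \]

(observe that $u_t$ is constant in time, Lipschitz-continuous in space, and uniformly bounded by $|u_t(x)| \le 2$). By construction for $t > 0$ the measure $\rho_t$ splits into a measure $\hat\rho$ with support in $B_R$ and a measure $\check\rho_t$ with support in $\mathbb{R}^d\setminus B_{R+t}$. Exploiting the fact that these supports stay at distance $t > 0$ away from each other it is easy to check that $(\rho_t, u_t)$ solves the non-conservative continuity equation in the sense of distributions (for $t > 0$ the “matching” at $|x| = R$ is never seen!). The positive gap moreover allows to compute for a.e. $t \in (0,1)$

\[ \begin{aligned} \int_{\mathbb{R}^d}\left(|\nabla u_t|^2 + |u_t|^2\right)d\rho_t &= \int_{B_R}\left(|\nabla u_t|^2 + |u_t|^2\right)d\hat\rho + \int_{\mathbb{R}^d\setminus B_{R+t}}\left(|\nabla u_t|^2 + |u_t|^2\right)d\check\rho_t \\ &\le 0 + \left(\|\nabla U\|^2_\infty + \|U\|^2_\infty\right)\int_{\mathbb{R}^d}d\check\rho_t \le Cm_t \le Ce^{2t}m_0, \end{aligned} \]

which shows that this path has finite energy. In particular by Lemma 2.5 $t \mapsto \rho_t$ is continuous with respect to our metric $d$, and $d(\rho,\rho_t)$ is small for small $t > 0$.

Now for fixed $\varepsilon > 0$ there exists $R > 0$ such that $m_0 = d\rho(\mathbb{R}^d\setminus B_R) \le \varepsilon$. Choosing $t = \delta > 0$ small enough we get that the measure $\tilde\rho := \rho_{t=\delta}$ satisfies

\[ \operatorname{supp}(\tilde\rho) \subset B_R \cup (\mathbb{R}^d\setminus B_{R+\delta}), \qquad \tilde\rho|_{B_R} = \rho|_{B_R}, \qquad d\tilde\rho(\mathbb{R}^d\setminus B_{R+\delta}) = m_\delta \le e^{2\delta}m_0 \le 2\varepsilon, \]

and

\[ d(\rho,\tilde\rho) \le \varepsilon. \]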
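Step 2: sending the exterior mass to zero. Choose now any smooth function $u(x)$ such that

\[ u(x) = \begin{cases} 0 & \text{if } |x| \le R, \\ 1 & \text{if } |x| \ge R+\delta, \end{cases} \]

and let

\[ u_t(x) := -\frac{2}{1-t}\,u(x) \qquad\text{and}\qquad \rho_t := \tilde\rho|_{B_R} + (1-t)^2\,\tilde\rho|_{\mathbb{R}^d\setminus B_{R+\delta}}. \]

Since $\rho_t$ always has a gap between $|x| \le R$ and $|x| \ge R+\delta$ it is straightforward to check that $(\rho_t, u_t)$ is an admissible curve connecting $\tilde\rho$ to $\rho' = \tilde\rho|_{B_R} = \rho|_{B_R}$ (in particular $\rho_t$ is narrowly continuous and solves the non-conservative continuity equation), with cost exactly

\[ \int_0^1\left(\int_{\mathbb{R}^d}\left(|\nabla u_t|^2 + |u_t|^2\right)d\rho_t\right)dt = 4m_\delta \le 8\varepsilon. \]

Indeed $u_t \equiv 0$ in $B_R$, the measure $\rho_t$ does not charge the annulus $R < |x| < R+\delta$, and by construction outside of $B_{R+\delta}$ the path $(\rho_t, u_t)$ is exactly the geodesic between the exterior part $\tilde\rho|_{\mathbb{R}^d\setminus B_{R+\delta}}$ and zero, see the proof of Proposition 2.6. By the triangular inequality we finally get

\[ d(\rho,\rho') \le d(\rho,\tilde\rho) + d(\tilde\rho,\rho') \le \varepsilon + \sqrt{8\varepsilon} \]

and, since $\varepsilon > 0$ was arbitrary and $\rho'$ is compactly supported, the conclusion follows. □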
Proposition 2.9. Let $\mathcal{P}_2(\mathbb{R}^d) \subset \mathcal{M}^+$ be the set of Borel probability measures with finite second moments and $W_2$ the quadratic Kantorovich-Rubinstein-Wasserstein distance defined on $\mathcal{P}_2$. Then for any $\rho_0, \rho_1 \in \mathcal{P}_2$ there holds

\[ d(\rho_0,\rho_1) \le W_2(\rho_0,\rho_1). \]

Proof. It is well known [?] that $(\mathcal{P}_2(\mathbb{R}^d), W_2)$ is a geodesic space, so we can connect $\rho_0$ and $\rho_1$ by a constant-speed geodesic path $\rho_t$ in this Wasserstein space. By [?, Theorem 13.8] there exists a velocity field $v_t \in L^\infty(0,1;L^2(d\rho_t))$ such that

\[ \partial_t\rho_t + \operatorname{div}(\rho_t v_t) = 0 \tag{2.5} \]

holds in the distributional sense, and $t \mapsto \rho_t$ is narrowly continuous (since it is continuous with values in the stronger Wasserstein metric topology).

For almost every $t \in (0,1)$ the distribution $\zeta_t = -\operatorname{div}(\rho_t v_t) \in \mathcal{D}'(\mathbb{R}^d)$ defines by duality a continuous linear form on $H^1(d\rho_t)$ with norm $\|\zeta_t\|_{H^{-1}(d\rho_t)} \le \|v_t\|_{L^2(d\rho_t)}$, thus by the Riesz representation theorem we can find a unique $u_t = (u_t, \nabla u_t) \in H^1(d\rho_t)$ such that

\[ \langle u_t, \cdot\rangle_{H^1(d\rho_t)} = \zeta_t \qquad\text{and}\qquad \|u_t\|_{H^1(d\rho_t)} = \|\zeta_t\|_{H^{-1}(d\rho_t)} \le \|v_t\|_{L^2(d\rho_t)}. \]

In particular by definition of $\langle\cdot,\cdot\rangle_{H^1(d\rho_t)}$ for a.e. $t \in (0,1)$ there holds

\[ -\operatorname{div}(\rho_t\nabla u_t) + \rho_t u_t = -\operatorname{div}(\rho_t v_t) \quad\text{in } \mathcal{D}'(\mathbb{R}^d), \]

and by (2.5) it is easy to check that $\partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_t u_t$ in the sense of distributions. Since $t \mapsto \rho_t$ is narrowly continuous we see that $(\rho_t, u_t)$ is an admissible path in the definition of our distance, and in particular

\[ d^2(\rho_0,\rho_1) \le E[\rho;u] = \int_0^1\|u_t\|^2_{H^1(d\rho_t)}\,dt \le \int_0^1\|v_t\|^2_{L^2(d\rho_t)}\,dt. \]

By the Benamou-Brenier formula [?], the right-hand side coincides with the squared Wasserstein distance $W_2^2(\rho_0,\rho_1)$ and the proof is achieved. □

Theorem 3. The metric $d$ is topologically equivalent to the bounded-Lipschitz metric $d_{BL}$, i.e. $d(\rho^k,\rho) \to 0$ if and only if $d_{BL}(\rho^k,\rho) \to 0$. Moreover, $(\mathcal{M}^+, d)$ is a complete metric space, $d$ metrizes the narrow convergence of measures on $\mathcal{M}^+(\mathbb{R}^d)$, and for any $\rho_0, \rho_1 \in \mathcal{M}^+$ with masses $m_0, m_1$ there holds

\[ d_{BL}(\rho_0,\rho_1) \le C\sqrt{m_0 + m_1}\;d(\rho_0,\rho_1). \tag{2.6} \]
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Proof of Theorem\^ We proceed in two steps: in step 1 we first prove that 
convergence in dsL implies convergence in d, and then in step 2 we estab¬ 
lish (|2.6I) and use this to deduce that any Cauchy sequence for d is Cauchy 
for dsL- Recalling that ,dBL) is complete, we see by step 2 that any 
Cauchy sequence in (Ai~^,d) is Cauchy in ,dBL) and therefore con¬ 
verges in ,dBL), so by step 1 it converges in {M+,d) as well. Also, 
since any converging sequence is Cauchy this immediately implies that any 
d-converging sequence is also dsL-converging. Finally, since dBi metrizes 
the narrow convergence clearly so does d. 

For convenience we split the first step into la and lb: the former essentially 
reduces the problem to compactly supported (sequences of) measures, and 
the latter concludes by renormalizing to unit masses and comparing d with 
the Wasserstein distance W 2 - 

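Step 1a. Take any converging sequence $\rho^k \to \rho$ in $(\mathcal{M}^+,d_{BL})$. Because the bounded-Lipschitz distance metrizes the narrow convergence, the masses converge, $m^k \to m$, and the sequence $\{\rho^k\}$ is tight. We want to prove that it converges in $(\mathcal{M}^+,d)$.

Since only a countable number of concentric spheres in $\mathbb{R}^d$ can have a positive Radon measure, there exists a sequence $R_n \to \infty$ such that $d\rho(\partial B_{R_n}) = 0$. By pa Prop. 1.203], $d\rho(A) \le \liminf_{k\to\infty} d\rho^k(A)$ for any open set $A \subset \mathbb{R}^d$ and $d\rho(B_{R_n}) = \lim_{k\to\infty} d\rho^k(B_{R_n})$. Hence, $d\rho|_{B_{R_n}}(A) \le \liminf_{k\to\infty} d\rho^k|_{B_{R_n}}(A)$ for any open $A \subset \mathbb{R}^d$ and $d\rho|_{B_{R_n}}(\mathbb{R}^d) = \lim_{k\to\infty} d\rho^k|_{B_{R_n}}(\mathbb{R}^d)$. Consequently, by m Prop. 1.206]

$$\forall n \in \mathbb{N}:\qquad \rho^k|_{B_{R_n}} \xrightarrow[k\to\infty]{} \rho|_{B_{R_n}} \quad\text{narrowly}. \tag{2.7}$$

Owing to the tightness of $\{\rho^k\}_{k\in\mathbb{N}}$ and using the same construction as in the proof of Theorem 2, one easily deduces that

$$d(\rho|_{B_{R_n}},\rho) + \sup_{k} d(\rho^k|_{B_{R_n}},\rho^k) \xrightarrow[n\to\infty]{} 0,$$

thus by the triangle inequality

$$d(\rho^k,\rho) \le d(\rho^k,\rho^k|_{B_{R_n}}) + d(\rho^k|_{B_{R_n}},\rho|_{B_{R_n}}) + d(\rho|_{B_{R_n}},\rho)$$

it suffices to prove that

$$\forall n \in \mathbb{N}:\qquad \rho^k|_{B_{R_n}} \xrightarrow[k\to\infty]{} \rho|_{B_{R_n}} \quad\text{in } (\mathcal{M}^+,d). \tag{2.8}$$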

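Step 1b. From now on we argue for fixed $n$, and denote for simplicity $\rho^k_n = \rho^k|_{B_{R_n}}$ and $\rho_n = \rho|_{B_{R_n}}$. Letting $m^k_n = d\rho^k(B_{R_n})$ and $m_n = d\rho(B_{R_n})$ we see by (2.7) that $m^k_n \to m_n$ when $k \to \infty$. If the limit $d\rho_n(\mathbb{R}^d) = m_n = 0$ then $\rho_n = 0$ in $\mathcal{M}^+$, so by Proposition 2.6 we have $d(\rho^k_n,\rho_n) = d(\rho^k_n,0) = 2\sqrt{m^k_n} \to 0$ and (2.8) clearly holds. Otherwise $m_n > 0$, hence $m^k_n > 0$ for $k$ large enough, and by the scaling Proposition 2.8 we have

$$d(\rho^k_n,\rho_n) \le d\Big(\rho^k_n, \frac{m_n}{m^k_n}\rho^k_n\Big) + d\Big(\frac{m_n}{m^k_n}\rho^k_n, \rho_n\Big) = 2\Big|\sqrt{m^k_n} - \sqrt{m_n}\Big| + \sqrt{m_n}\; d\Big(\frac{\rho^k_n}{m^k_n}, \frac{\rho_n}{m_n}\Big),$$

and it suffices to prove that the sequence of renormalized probability measures $\tilde\rho^k_n := \rho^k_n / m^k_n$ converges to $\tilde\rho_n := \rho_n / m_n$ in $(\mathcal{M}^+,d)$. Note that by construction $\tilde\rho^k_n, \tilde\rho_n$ are supported in a fixed ball $B_{R_n}$, so in particular they have uniformly bounded second moments. Applying Proposition 2.9 we see that $d(\tilde\rho^k_n,\tilde\rho_n) \le \mathcal{W}_2(\tilde\rho^k_n,\tilde\rho_n)$, and recalling that $m^k_n \to m_n > 0$ and that $\rho^k_n \to \rho_n$ narrowly it is easy to see that $\tilde\rho^k_n \to \tilde\rho_n$ narrowly as well. Applying [43, Thm. 7.12] we conclude that $\mathcal{W}_2(\tilde\rho^k_n,\tilde\rho_n) \to 0$, whence $d(\tilde\rho^k_n,\tilde\rho_n) \to 0$ and (2.8) holds as desired.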

Step 2. Fix $\rho_0, \rho_1$, and let $(\rho_t,\mathbf{u}_t)$ be any admissible path from $\rho_0$ to $\rho_1$ with finite energy $E$. Taking the supremum over $\varphi$ in (2.2) we get $d_{BL}(\rho_0,\rho_1) \le \sqrt{ME}$, where $M = 2(\max\{m_0,m_1\} + E)$ as in Lemma 2.2. Choosing now a minimizing sequence instead of an arbitrary path and taking the limit we essentially obtain the same estimate with $E = \lim E[\rho^k;\mathbf{u}^k] = d^2(\rho_0,\rho_1)$, whence

$$d_{BL}(\rho_0,\rho_1) \le \sqrt{2\big(\max\{m_0,m_1\} + d^2(\rho_0,\rho_1)\big)}\; d(\rho_0,\rho_1).$$

By the triangle inequality and Proposition 2.6 we control $d^2(\rho_0,\rho_1) \le 4\big(d^2(\rho_0,0) + d^2(0,\rho_1)\big) = 16(m_0+m_1)$, which immediately yields (2.6).

Finally, let $\rho^k$ be a Cauchy sequence in $(\mathcal{M}^+,d)$ with mass $m^k = d\rho^k(\mathbb{R}^d)$. Since Cauchy sequences are bounded we control $4m^k = d^2(\rho^k,0) \le C$ uniformly in $k$, thus from (2.6) we see that

$$d_{BL}(\rho^p,\rho^q) \le C\sqrt{m^p+m^q}\; d(\rho^p,\rho^q) \le C\, d(\rho^p,\rho^q).$$

As a consequence $\rho^k$ is Cauchy for the bounded-Lipschitz distance and the proof is complete. □


Stanislav Kondratyev, Leonard Monsaingeon, and Dmitry Vorotnikov

Remark 2.10. Observe that we did not prove that the distances $d$ and $d_{BL}$ are Lipschitz equivalent: the identity map $\mathrm{id} : (\mathcal{M}^+,d) \to (\mathcal{M}^+,d_{BL})$ need not be globally bi-Lipschitz. This is actually impossible due to the different scalings: from Proposition 2.8 we know that $d(\lambda\rho_0,\lambda\rho_1) = \sqrt{\lambda}\,d(\rho_0,\rho_1)$, but it is easy to check that $d_{BL}(\lambda\rho_0,\lambda\rho_1) = \lambda\, d_{BL}(\rho_0,\rho_1)$. However, our estimate (2.6) shows that $\mathrm{id} : (\mathcal{M}^+,d) \to (\mathcal{M}^+,d_{BL})$ is Lipschitz on bounded sets.
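The scaling mismatch can be made concrete on the simplest possible pair of measures. The following minimal sketch takes $\rho_0 = m\,\delta_x$ and $\rho_1 = 0$, uses the value $d(\rho,0) = 2\sqrt{m}$ quoted from Proposition 2.6, and assumes (for illustration only) that $d_{BL}$ is normalized as the supremum of $|\int\varphi\,d(\rho_0-\rho_1)|$ over $\|\varphi\|_\infty + \|\nabla\varphi\|_\infty \le 1$, so that $d_{BL}(m\delta_x,0) = m$. All function names are hypothetical.

```python
import math

def d_to_zero(mass):
    """d(rho, 0) = 2*sqrt(m), i.e. d^2(rho, 0) = 4m (value quoted from Proposition 2.6)."""
    return 2.0 * math.sqrt(mass)

def d_bl_dirac_to_zero(mass):
    """Assumed normalization: d_BL is the supremum of |int phi d(rho0 - rho1)| over
    |phi|_inf + |grad phi|_inf <= 1. For rho0 = m*delta_x and rho1 = 0 the
    supremum is attained by the constant test function phi = 1 and equals m."""
    return mass

def ratio(mass):
    """d_BL / d for the pair (m*delta_x, 0); analytically sqrt(m)/2."""
    return d_bl_dirac_to_zero(mass) / d_to_zero(mass)

masses = [1e-4, 1e-2, 1.0, 1e2, 1e4]
ratios = [ratio(m) for m in masses]
for m, r in zip(masses, ratios):
    print(f"m = {m:12.4f}   d_BL/d = {r:10.4f}")
```

Under these assumptions the ratio $d_{BL}/d = \sqrt{m}/2$ is neither bounded above nor bounded away from zero over all masses, so no global bi-Lipschitz constants can exist, while on sets of bounded mass it stays bounded, in agreement with (2.6).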

2.3. Characterization of Lipschitz curves. 

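Theorem 4. Let $\{\rho_t\}_{t\in[0,1]}$ be an $L$-Lipschitz curve w.r.t. our metric $d$. Then there exists a potential $\mathbf{u} \in L^2(0,1;H^1(d\rho_t))$ such that

$$\partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_t u_t \quad\text{in } \mathcal{D}'((0,1)\times\mathbb{R}^d),$$

and

$$\|\mathbf{u}_t\|_{H^1(d\rho_t)} \le L \quad\text{a.e. } t \in (0,1).$$

Conversely, if $t \mapsto \rho_t \in \mathcal{M}^+$ is narrowly continuous and $(\rho_t,\mathbf{u}_t)_{t\in[0,1]}$ solves the non-conservative continuity equation with $\|\mathbf{u}_t\|_{H^1(d\rho_t)} \le L$ for a.e. $t \in (0,1)$, then $t \mapsto \rho_t$ is $L$-Lipschitz with respect to the distance $d$.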


Before proceeding with the proof we will need the following technical 
lemma: 

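Lemma 2.11. For any $\rho_0,\rho_1 \in \mathcal{M}^+$ and $\varepsilon > 0$ there exists a narrowly continuous curve $\rho \in C_w([0,1];\mathcal{M}^+)$ connecting $\rho_0$ to $\rho_1$ and a potential $\mathbf{u} \in L^2(0,1;H^1(d\rho_t))$, both depending on $\varepsilon$, solving the non-conservative continuity equation and such that

$$\forall \tau \in [0,1]:\qquad d(\rho_0,\rho_\tau) \le (1+\varepsilon)\,d(\rho_0,\rho_1) \tag{2.9}$$

with also

$$d^2(\rho_0,\rho_1) \le E[\rho;\mathbf{u}] = \int_0^1 \|\mathbf{u}_t\|^2_{H^1(d\rho_t)}\,dt \le (1+\varepsilon)^2 d^2(\rho_0,\rho_1). \tag{2.10}$$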

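Proof. If $(\rho^k,\mathbf{u}^k)$ is any minimizing sequence in the definition of $d^2(\rho_0,\rho_1)$ then (2.10) is obviously satisfied for large $k \ge k_0$ and we only have to check that (2.9) holds as well if $k$ is large enough. Note that for any $k \in \mathbb{N}$ and fixed $\tau \in [0,1]$, a simple time reparametrization $t = \tau s$ gives an admissible curve $(\tilde\rho_s,\tilde{\mathbf{u}}_s)_{s\in[0,1]} := (\rho^k_{\tau s},\tau\mathbf{u}^k_{\tau s})_{s\in[0,1]}$ connecting $\rho_0$ to $\rho^k_\tau$ in time $s \in [0,1]$. By definition of our distance, changing variables, and because $\tau \le 1$, we get

$$d^2(\rho_0,\rho^k_\tau) \le E[\tilde\rho;\tilde{\mathbf{u}}] = \tau\int_0^\tau\int_{\mathbb{R}^d}\big(|\nabla u^k_t|^2 + |u^k_t|^2\big)\,d\rho^k_t\,dt \le \int_0^1\int_{\mathbb{R}^d}\big(|\nabla u^k_t|^2 + |u^k_t|^2\big)\,d\rho^k_t\,dt = E[\rho^k;\mathbf{u}^k].$$

Since the last term above converges to $d^2(\rho_0,\rho_1)$ as $k \to \infty$, (2.9) holds for $k$ large enough and the conclusion follows. □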
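The change of variables used in this proof can be checked numerically. The sketch below treats the squared speed $f(t) = \|\mathbf{u}_t\|^2_{H^1(d\rho_t)}$ as a given scalar profile (the particular $f$ is an arbitrary choice made for illustration, not taken from the text) and compares the energy of the rescaled path, $\int_0^1 \tau^2 f(\tau s)\,ds$, with $\tau\int_0^\tau f(t)\,dt$.

```python
import math

def midpoint_integral(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def squared_speed(t):
    """Hypothetical squared-speed profile f(t) = ||u_t||^2; any nonnegative function would do."""
    return 1.0 + math.sin(3.0 * t) ** 2

def rescaled_energy(tau):
    """Energy of the rescaled path s -> (rho_{tau s}, tau * u_{tau s}) on [0, 1]."""
    return midpoint_integral(lambda s: tau ** 2 * squared_speed(tau * s), 0.0, 1.0)

def truncated_energy(tau):
    """tau times the energy spent by the original path on [0, tau]."""
    return tau * midpoint_integral(squared_speed, 0.0, tau)

full_energy = midpoint_integral(squared_speed, 0.0, 1.0)
for tau in (0.25, 0.5, 0.9, 1.0):
    print(tau, rescaled_energy(tau), truncated_energy(tau), full_energy)
```

For every $\tau \le 1$ the two quantities agree and do not exceed the full energy, which is exactly the chain of (in)equalities used to derive (2.9).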


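Proof of Theorem 4. The argument is similar to [44, Thm. 13.8] and [3, Thm. 8.3.1]. Fix any $\psi \in C_c^\infty(\mathbb{R}^d)$. Since we assumed that

$$\forall t,s \in [0,1]:\qquad d(\rho_t,\rho_s) \le L|t-s|$$

and because locally our distance is Lipschitz-stronger than the bounded-Lipschitz one (see Remark 2.10), we see that

$$t \mapsto \Psi(t) := \int_{\mathbb{R}^d}\psi\,d\rho_t$$

is locally Lipschitz, thus differentiable almost everywhere. Fix any time $t_0 \in [0,1]$ where $\Psi$ is differentiable, and let $h_n \to 0$ with $t_0 + h_n \in [0,1]$. Taking $\varepsilon = 1/n$, $\rho_0 = \rho_{t_0}$, and $\rho_1 = \rho_{t_0+h_n}$, let $(\rho^n_t,\mathbf{u}^n_t)_{t\in[0,1]}$ be any curve from Lemma 2.11 connecting $\rho_{t_0}$ to $\rho_{t_0+h_n}$ in time $t \in [0,1]$ while solving the non-conservative continuity equation and satisfying (2.9), (2.10). Using the Cauchy-Schwarz inequality (first in space and then in time) gives that

$$\frac{|\Psi(t_0+h_n)-\Psi(t_0)|}{|h_n|} = \frac{1}{|h_n|}\left|\int_{\mathbb{R}^d}\psi\,(d\rho_{t_0+h_n}-d\rho_{t_0})\right| = \frac{1}{|h_n|}\left|\int_0^1\int_{\mathbb{R}^d}\big(\nabla\psi\cdot\nabla u^n_t + \psi u^n_t\big)\,d\rho^n_t\,dt\right|$$
$$\le \underbrace{\left(\int_0^1\int_{\mathbb{R}^d}\big(|\nabla\psi|^2+|\psi|^2\big)\,d\rho^n_t\,dt\right)^{1/2}}_{:=A_n}\cdot\underbrace{\frac{1}{|h_n|}\left(\int_0^1\|\mathbf{u}^n_t\|^2_{H^1(d\rho^n_t)}\,dt\right)^{1/2}}_{:=B_n}.$$

From the previous Lemma 2.11 and the definition of $B_n$ we first estimate, with $\varepsilon = 1/n$ in (2.10),

$$\limsup_{n\to\infty} B_n \le \limsup_{n\to\infty}\,(1+1/n)\,\frac{d(\rho_{t_0},\rho_{t_0+h_n})}{|h_n|} \le L.$$

In order to handle the term $A_n$, let us define by duality the measures $\mu^n \in \mathcal{M}^+((0,1)\times\mathbb{R}^d)$ as

$$\forall \phi \in C_b((0,1)\times\mathbb{R}^d):\qquad \int_{(0,1)\times\mathbb{R}^d}\phi(t,x)\,d\mu^n(t,x) := \int_0^1\left(\int_{\mathbb{R}^d}\phi(t,x)\,d\rho^n_t(x)\right)dt.$$

Using again Lemma 2.11 and in particular (2.9) with $\varepsilon = 1/n$, we see that for all fixed $t \in [0,1]$ there holds

$$d(\rho_{t_0},\rho^n_t) \le (1+1/n)\,d(\rho_{t_0},\rho_{t_0+h_n}) \to 0$$

when $n \to \infty$ (because the path $\rho_t$ is Lipschitz). By Theorem 3 we get

$$\forall t \in [0,1]:\qquad \rho^n_t \to \rho_{t_0} \quad\text{narrowly}.$$

A simple application of Lebesgue's dominated convergence shows that $\mu^n$ converges narrowly to the measure $\mu$ similarly defined by

$$\forall \phi \in C_b((0,1)\times\mathbb{R}^d):\qquad \int_{(0,1)\times\mathbb{R}^d}\phi(t,x)\,d\mu(t,x) := \int_0^1\left(\int_{\mathbb{R}^d}\phi(t,x)\,d\rho_{t_0}(x)\right)dt.$$

Since we consider $\psi = \psi(x)$ only and the limit $\mu$ actually does not act in time we get

$$A_n \xrightarrow[n\to\infty]{} \left(\int_0^1\int_{\mathbb{R}^d}\big(|\nabla\psi|^2+|\psi|^2\big)\,d\rho_{t_0}\,dt\right)^{1/2} = \left(\int_{\mathbb{R}^d}\big(|\nabla\psi|^2+|\psi|^2\big)\,d\rho_{t_0}\right)^{1/2} = \|\psi\|_{H^1(d\rho_{t_0})}.$$

Thus for fixed $\psi \in C_c^\infty(\mathbb{R}^d)$ and at any point of differentiability $t_0 \in [0,1]$ we get that

$$|\Psi'(t_0)| \le L\,\|\psi\|_{H^1(d\rho_{t_0})}.$$

The distribution $\zeta(t_0) \in \mathcal{D}'(\mathbb{R}^d)$ defined by $\langle\zeta(t_0),\psi\rangle = \Psi'(t_0)$ thus extends to a continuous linear form on $H^1(d\rho_{t_0})$ with $\|\zeta(t_0)\|_{H^{-1}(d\rho_{t_0})} \le L$. By the Riesz representation theorem in $H^1(d\rho_{t_0})$ we can then find a unique potential $\mathbf{u}_{t_0} \in H^1(d\rho_{t_0})$ such that

$$-\operatorname{div}(\rho_{t_0}\nabla u_{t_0}) + \rho_{t_0}u_{t_0} = \zeta(t_0) \quad\text{in } \mathcal{D}'(\mathbb{R}^d), \tag{2.11}$$

and

$$\|\mathbf{u}_{t_0}\|_{H^1(d\rho_{t_0})} = \|\zeta(t_0)\|_{H^{-1}(d\rho_{t_0})} \le L$$

(if $\rho_{t_0} = 0$ in $\mathcal{M}^+(\mathbb{R}^d)$ we simply define $\mathbf{u}_{t_0} = 0$).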

Recalling that $\Psi(t) = \int_{\mathbb{R}^d}\psi\,d\rho_t$ is a.e. differentiable, we see by (2.11) that

$$\frac{d}{dt}\int_{\mathbb{R}^d}\psi\,d\rho_t = \int_{\mathbb{R}^d}\big(\nabla\psi\cdot\nabla u_t + \psi u_t\big)\,d\rho_t \qquad\text{a.e. } t \in (0,1). \tag{2.12}$$

A subtle issue is that the "almost every $t \in (0,1)$" set of differentiability of $\Psi$ might depend here on the choice of $\psi$. Arguing by density and separability as in [44, Thm. 13.8] we can conclude nonetheless that (2.12) holds for any $\psi \in C_c^\infty(\mathbb{R}^d)$, which is of course an admissible weak formulation for $\partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_t u_t$. Moreover by construction we have

$$\|\mathbf{u}_t\|_{H^1(d\rho_t)} \le L \qquad\text{a.e. } t \in (0,1)$$

as desired.

Conversely, assume that $(\rho_t,\mathbf{u}_t)_{t\in[0,1]}$ solves the non-conservative continuity equation with $\|\mathbf{u}_t\|_{H^1(d\rho_t)} \le L$ for a.e. $t \in (0,1)$ and that $\rho \in C_w([0,1];\mathcal{M}^+)$. Then clearly the total energy satisfies $E \le L^2$. For any $0 \le t_0 \le t_1 \le 1$ it is easy to scale in time and connect $\rho_{t_0}, \rho_{t_1}$ with cost $(t_1-t_0)\int_{t_0}^{t_1}\|\mathbf{u}_t\|^2_{H^1(d\rho_t)}\,dt$, thus $d^2(\rho_{t_0},\rho_{t_1}) \le L^2|t_1-t_0|^2$ and the proof is complete. □

2.4. Lower semicontinuity of the metric with respect to the weak-* 
topology. 

Definition 2.12. Let $(X,\varrho)$ be a metric space, and let $\sigma$ be a locally compact Hausdorff topology on $X$. We say that the distance $\varrho$ is sequentially lower semicontinuous with respect to $\sigma$ if for all $\sigma$-converging sequences $x_k \to x$, $y_k \to y$ one has

$$\varrho(x,y) \le \liminf_{k\to\infty}\varrho(x_k,y_k).$$

Theorem 5. The distance $d$ is sequentially lower semicontinuous with respect to the weak-* topology on $\mathcal{M}^+$.

Proof. Consider any two converging sequences

$$\rho^k_0 \xrightarrow[k\to\infty]{} \rho_0, \qquad \rho^k_1 \xrightarrow[k\to\infty]{} \rho_1 \qquad\text{weakly-*}$$

of finite Radon measures from $\mathcal{M}^+$. For each $k$, the endpoints $\rho^k_0$ and $\rho^k_1$ can be joined by an admissible narrowly continuous path $(\rho^k_t,\mathbf{u}^k_t)_{t\in[0,1]}$ with energy

$$E[\rho^k;\mathbf{u}^k] \le d^2(\rho^k_0,\rho^k_1) + k^{-1}.$$

Due to weak-* compactness, the masses $m^k_0 = d\rho^k_0(\mathbb{R}^d)$ and $m^k_1 = d\rho^k_1(\mathbb{R}^d)$ are bounded uniformly in $k \in \mathbb{N}$. By Corollary 2.7 the set $\bigcup_{k\in\mathbb{N}}\{\rho^k_0,\rho^k_1\}$ is bounded in $(\mathcal{M}^+,d)$, thus the energies $E[\rho^k;\mathbf{u}^k]$ and the masses $m^k_t = d\rho^k_t(\mathbb{R}^d)$ are bounded uniformly in $k \in \mathbb{N}$ and $t \in [0,1]$:

$$m^k_t \le M \qquad\text{and}\qquad E[\rho^k;\mathbf{u}^k] \le \bar E.$$

By the (classical) Banach-Alaoglu theorem with $\mathcal{M} = (C_0)^*$, all the curves $(\rho^k_t)_{t\in[0,1]}$ lie in a fixed weak-* sequentially relatively compact set $\mathcal{K}_M = \{\rho \in \mathcal{M}^+ : d\rho(\mathbb{R}^d) \le M\}$ uniformly in $k,t$. By the fundamental estimate (2.2) we get

$$\left|\int_{\mathbb{R}^d}\phi\,(d\rho^k_t - d\rho^k_s)\right| \le C\,|t-s|^{1/2}\big(\|\nabla\phi\|_\infty + \|\phi\|_\infty\big)$$

for all $\phi \in C^1_b$, which implies

$$\forall t,s \in [0,1],\ \forall k \in \mathbb{N}:\qquad d_{BL}(\rho^k_s,\rho^k_t) \le C|t-s|^{1/2}.$$

Invoking the completeness of $(\mathcal{M}^+,d_{BL})$, the above uniform $1/2$-Hölder continuity w.r.t. $d_{BL}$, the sequential lower semicontinuity of $d_{BL}$ with respect to the weak-* convergence (Lemma 5.1 in the Appendix), and the fact that $\rho^k_t \in \mathcal{K}_M$, we conclude by a refined version of the Arzelà-Ascoli theorem [3, Prop. 3.3.1] that there exists a $d_{BL}$ (thus narrowly) continuous curve $(\rho_t)_{t\in[0,1]}$ connecting $\rho_0$ and $\rho_1$ such that

$$\forall t \in [0,1]:\qquad \rho^k_t \to \rho_t \quad\text{weakly-*} \tag{2.13}$$

along some subsequence $k \to \infty$ (not relabeled here). Let $Q := (0,1)\times\mathbb{R}^d$ and let $\mu^k$ be the measure on $Q$ defined by duality as

$$\forall \phi \in C_c(Q):\qquad \int_Q\phi(t,x)\,d\mu^k(t,x) = \int_0^1\left(\int_{\mathbb{R}^d}\phi(t,\cdot)\,d\rho^k_t\right)dt.$$

Exploiting the pointwise convergence (2.13) and the uniform bound on the masses $m^k_t \le M$, a simple application of Lebesgue's dominated convergence guarantees that

$$\mu^k \to \mu^0 \quad\text{weakly-* in } \mathcal{M}^+(Q),$$

where the finite measure $\mu^0$ is defined by duality in terms of the weak-* limit $\rho_t = \lim\rho^k_t$ (as $\mu^k$ was in terms of $\rho^k_t$). Then the energy bound reads

$$\|\mathbf{u}^k\|^2_{L^2(0,1;H^1(d\rho^k_t))} \le d^2(\rho^k_0,\rho^k_1) + k^{-1} \le C.$$

We are going to apply a variant of the Banach-Alaoglu theorem, Proposition 5.2 in the Appendix, in the space

$$X = C^1_c(Q).$$

Namely, we set

$$\|\phi\| = \|\nabla\phi\|_{L^\infty(Q)} + \|\phi\|_{L^\infty(Q)},$$
$$\|\phi\|_k = \left(\int_Q\big(|\nabla\phi|^2 + |\phi|^2\big)\,d\mu^k\right)^{1/2}, \qquad k = 0,1,\dots,$$

and define the linear forms

$$\Phi_k(\phi) = \int_Q\big(\nabla u^k\cdot\nabla\phi + u^k\phi\big)\,d\mu^k, \qquad k = 1,2,\dots.$$

The separability of $C^1_c(Q)$, the weak-* convergence of $\mu^k$, the uniform boundedness of the masses $\mu^k(Q) \le M$, and the Cauchy-Schwarz inequality imply that the hypotheses of our Proposition 5.2 are met with

$$c_k := \|\Phi_k\|_{(X,\|\cdot\|_k)^*} \le \|\mathbf{u}^k\|_{L^2(0,1;H^1(d\rho^k_t))} \le \big(d^2(\rho^k_0,\rho^k_1) + k^{-1}\big)^{1/2}.$$

Consequently, there exists a continuous functional $\Phi_0$ on the space $(X,\|\cdot\|_0)$ such that up to a subsequence

$$\forall \phi \in C^1_c(Q):\qquad \int_0^1\left(\int_{\mathbb{R}^d}\big\{\nabla u^k_t\cdot\nabla\phi(t,\cdot) + u^k_t\,\phi(t,\cdot)\big\}\,d\rho^k_t\right)dt \xrightarrow[k\to\infty]{} \Phi_0(\phi)$$

with moreover

$$\|\Phi_0\|_{(X,\|\cdot\|_0)^*} \le \liminf_{k\to\infty} d(\rho^k_0,\rho^k_1). \tag{2.14}$$

Let $N_0 \subset X$ be the kernel of the seminorm $\|\cdot\|_0$. By the Riesz representation theorem, the dual $(X,\|\cdot\|_0)^* = (X/N_0,\|\cdot\|_0)^*$ can be isometrically identified with the completion $\overline{X/N_0}$ of $X/N_0$ with respect to $\|\cdot\|_0$, which is exactly $L^2(0,1;H^1(d\rho_t))$. As a consequence there exists $\mathbf{u} = (u,\nabla u) \in L^2(0,1;H^1(d\rho_t))$ such that

$$\Phi_0(\phi) = \int_Q\big(\nabla u\cdot\nabla\phi + u\phi\big)\,d\mu^0 = \int_0^1\left(\int_{\mathbb{R}^d}\big(\nabla u_t\cdot\nabla\phi + u_t\phi\big)\,d\rho_t\right)dt$$

and

$$\|\mathbf{u}\|_{L^2(0,1;H^1(d\rho_t))} = \|\Phi_0\|_{(X,\|\cdot\|_0)^*},$$

and it is straightforward to check that $(\rho,\mathbf{u})$ is an admissible curve joining $\rho_0, \rho_1$ (the above convergence is enough to take the limit in the weak formulation of the non-conservative continuity equation). Recalling (2.14), it remains to take into account that by the definition of our distance

$$d^2(\rho_0,\rho_1) \le E[\rho;\mathbf{u}] = \|\mathbf{u}\|^2_{L^2(0,1;H^1(d\rho_t))} = \|\Phi_0\|^2_{(X,\|\cdot\|_0)^*} \le \liminf_{k\to\infty} d^2(\rho^k_0,\rho^k_1).$$

□


2.5. Existence of geodesics. We are now in a position to prove an important result, namely the existence and characterization of geodesics for our metric structure. We give two proofs: one is more constructive and inspired by optimal transport theory, and the other is shorter and more abstract.

Theorem 6. $(\mathcal{M}^+,d)$ is a geodesic space, and for all $\rho_0,\rho_1 \in \mathcal{M}^+$ the infimum in (2.1) is always a minimum. Moreover this minimum is attained for a (narrowly continuous) curve $\rho$ such that $d(\rho_t,\rho_s) = |t-s|\,d(\rho_0,\rho_1)$ and a potential $\mathbf{u} \in L^2(0,1;H^1(d\rho_t))$ such that $\|\mathbf{u}_t\|_{H^1(d\rho_t)} = \mathrm{cst} = d(\rho_0,\rho_1)$ for a.e. $t \in [0,1]$.

In Section 3 we will show that $(\mathcal{M}^+,d)$ can be viewed as a formal Riemannian manifold, with tangent plane $T_\rho\mathcal{M}^+ \simeq \{\zeta = -\operatorname{div}(\rho\nabla u) + \rho u :\ \mathbf{u} \in H^1(d\rho)\}$ and $\|\zeta\|_{T_\rho\mathcal{M}^+} = \|\mathbf{u}\|_{H^1(d\rho)}$. In this perspective Theorem 6 can be interpreted as

$$\left\|\frac{d\rho_t}{dt}\right\|_{T_{\rho_t}\mathcal{M}^+} = \mathrm{cst} = d(\rho_0,\rho_1),$$

which should be expected along constant-speed geodesics $(\rho_t)_{t\in[0,1]}$ connecting $\rho_0,\rho_1$.


Proof 1 via time reparametrization. Fix any $\rho_0,\rho_1$ and let $(\rho^k_t,\mathbf{u}^k_t)_{t\in[0,1]}$ be a minimizing sequence in $E[\rho^k;\mathbf{u}^k] = \int_0^1\|\mathbf{u}^k_t\|^2_{H^1(d\rho^k_t)}\,dt \to d^2(\rho_0,\rho_1)$. We first claim that we can assume without loss of generality

$$\forall\, 0 \le t_0 \le t_1 \le 1:\qquad d^2(\rho^k_{t_0},\rho^k_{t_1}) \le |t_0-t_1|^2\,E[\rho^k;\mathbf{u}^k]. \tag{2.15}$$

Indeed, using a simple arclength reparametrization (Lemma 5.3 in the Appendix) we can assume that $\|\mathbf{u}^k_t\|^2_{H^1(d\rho^k_t)} = \mathrm{cst} = E[\rho^k;\mathbf{u}^k]$ is constant in time for all $k \in \mathbb{N}$. Scaling in time $t = t_0 + (t_1-t_0)s$ as before, we get by definition of our distance and Remark 2.3 that $d^2(\rho^k_{t_0},\rho^k_{t_1}) \le |t_1-t_0|\int_{t_0}^{t_1}\|\mathbf{u}^k_t\|^2_{H^1(d\rho^k_t)}\,dt = (t_1-t_0)^2E[\rho^k;\mathbf{u}^k]$.

Now because the energies $E[\rho^k;\mathbf{u}^k]$ are bounded, (2.15) shows that the sequence of curves $\{t \mapsto \rho^k_t\}$ is uniformly $1/2$-Hölder continuous with respect to our metric $d$, and arguing as before the masses $m^k_t \le M$ are bounded uniformly in $t \in [0,1]$ and $k \in \mathbb{N}$, thus the $\rho^k_t$ lie in a fixed weakly-* relatively sequentially compact set $\mathcal{K}_M = \{\rho : d\rho(\mathbb{R}^d) \le M\}$. By Theorem 5 we know that $d$ is lower semicontinuous with respect to weak-* convergence, and applying again the refined Arzelà-Ascoli theorem [3, Prop. 3.3.1] we conclude that there exists a $d$-continuous (thus narrowly continuous) curve $\rho$ such that

$$\forall t \in [0,1]:\qquad \rho^k_t \to \rho_t \quad\text{weakly-*}.$$

From (2.15) this also implies

$$d^2(\rho_{t_0},\rho_{t_1}) \le \liminf_{k\to\infty} d^2(\rho^k_{t_0},\rho^k_{t_1}) \le \liminf_{k\to\infty}|t_0-t_1|^2E[\rho^k;\mathbf{u}^k] = |t_1-t_0|^2\,d^2(\rho_0,\rho_1)$$

for all $t_0,t_1 \in [0,1]$. By the triangle inequality we conclude that in fact

$$\forall t_0,t_1 \in [0,1]:\qquad d(\rho_{t_0},\rho_{t_1}) = |t_1-t_0|\,d(\rho_0,\rho_1),$$

and in particular the curve $t \mapsto \rho_t$ is $L$-Lipschitz with $L = d(\rho_0,\rho_1)$.

Applying Theorem 4 we see that there exists a potential $\mathbf{u}_t$ solving the non-conservative continuity equation such that $\|\mathbf{u}_t\|_{H^1(d\rho_t)} \le L$ a.e. $t \in (0,1)$, and because

$$L^2 = d^2(\rho_0,\rho_1) \le E[\rho;\mathbf{u}] = \int_0^1\|\mathbf{u}_t\|^2_{H^1(d\rho_t)}\,dt \le L^2$$

we conclude that in fact $\|\mathbf{u}_t\|_{H^1(d\rho_t)} = L = d(\rho_0,\rho_1)$ a.e. $t \in (0,1)$. □

Proof 2 via midpoints. We first observe from the definition of our distance that $(\mathcal{M}^+,d)$ is a length space (e.g. by an application of the almost midpoint criterion, HI Thm. 2.4.16(2)]). By Corollary 2.4 and the (classical) Banach-Alaoglu theorem, the weak-* topology is $d$-boundedly compact. Now Lemma 5.4 (a variant of the Hopf-Rinow theorem, in the Appendix) together with Theorem 3 and Theorem 5 imply that $(\mathcal{M}^+,d)$ is a geodesic space. The existence and claimed properties of the minimizer follow from Theorem 4 exactly as in Proof 1. □

Remark 2.13. The geodesics can be non-unique, see Section 3.5.

3. Underlying geometry 

We show here that our distance $d$ endows $\mathcal{M}^+(\mathbb{R}^d)$ with a formal Riemannian structure. We closely follow Otto's approach [39], which was originally developed for the optimal transport of probability measures in the Wasserstein metric space $(\mathcal{P}_2,\mathcal{W}_2)$. We refer the reader to the particularly clear exposition in [43, 44].

3.1. Otto's Riemannian formalism. Let us recall that if $\mathcal{M}$ is a smooth differential manifold then the tangent plane $T_x\mathcal{M}$ at a point $x \in \mathcal{M}$ can be viewed as the vector space of tangent vectors $\frac{dx_t}{dt}\big|_{t=0}$ of all curves $t \mapsto x_t \in \mathcal{M}$ passing through $x(0) = x$. By Theorem 4 and Theorem 6 we know that, given any fixed endpoint $\rho \in \mathcal{M}^+$, there always exists a constant-speed geodesic $\rho_t$ connecting $\rho|_{t=0} = \rho$ to an arbitrary $\nu \in \mathcal{M}^+$ and parametrized as

$$\begin{cases}\partial_t\rho_t = -\operatorname{div}(\rho_t\nabla u_t) + \rho_t u_t & \text{in } \mathcal{D}'((0,1)\times\mathbb{R}^d),\\ \|\mathbf{u}_t\|_{H^1(d\rho_t)} = \mathrm{cst} & \text{a.e. } t \in (0,1).\end{cases}$$

Since $\rho_t$ is a constant-speed geodesic in $\mathcal{M}^+$ it should be a $C^1$ curve, and, with a slight abuse of notation, we naturally identify the distributional term $\partial_t\rho_t$ above with a tangent vector $\frac{d\rho_t}{dt} \in T_{\rho_t}\mathcal{M}^+$. According to $\zeta_t = -\operatorname{div}(\rho_t\nabla u_t) + \rho_t u_t$ we see that the tangent vector $\zeta_t = \frac{d\rho_t}{dt}$ corresponds to a potential $\mathbf{u}_t = (u_t,\nabla u_t) \in H^1(d\rho_t)$. Assuming for simplicity that $\zeta_t, \mathbf{u}_t$ can be somehow evaluated at $t = 0^+$, we see that any geodesic passing through the fixed endpoint $\rho|_{t=0} = \rho$ gives a tangent vector $\frac{d\rho_t}{dt}\big|_{t=0}$, which thus corresponds to some $\mathbf{u}|_{t=0} \in H^1(d\rho)$. This strongly suggests defining the tangent space in terms of potentials as

$$T_\rho\mathcal{M}^+ := \{\zeta = -\operatorname{div}(\rho\nabla u) + \rho u :\ \mathbf{u} = (u,\nabla u) \in H^1(d\rho)\}$$

and

$$\|\zeta\|_{T_\rho\mathcal{M}^+} := \|\mathbf{u}\|_{H^1(d\rho)}.$$

Observe that, given $\zeta$ and ignoring all smoothness issues, the elliptic PDE

$$-\operatorname{div}(\rho\nabla u) + \rho u = \zeta \tag{3.1}$$

is coercive in $H^1(d\rho)$, so the correspondence between tangent vectors $\zeta$ and potentials $\mathbf{u}$ is at least formally uniquely defined. By polarization this automatically defines the Riemannian metric on $T_\rho\mathcal{M}^+$

$$\langle\zeta_1,\zeta_2\rangle_\rho := (\mathbf{u}_1,\mathbf{u}_2)_{H^1(d\rho)} = \int_{\mathbb{R}^d}\big(\nabla u_1\cdot\nabla u_2 + u_1u_2\big)\,d\rho,$$

where the subscript $\rho$ in the left-hand side highlights the dependence on the base point $\rho \in \mathcal{M}^+$ in the weighted $H^1(d\rho)$ scalar product. A formal but useful remark is that

$$\left\langle\frac{d\rho}{dt_1},\frac{d\rho}{dt_2}\right\rangle_\rho = \int_{\mathbb{R}^d}\big(\nabla u_1\cdot\nabla u_2 + u_1u_2\big)\,d\rho = \int_{\mathbb{R}^d}u_2\,\zeta_1\,dx = \int_{\mathbb{R}^d}u_2\,\partial_{t_1}\rho\,dx = \int_{\mathbb{R}^d}u_1\,\zeta_2\,dx = \int_{\mathbb{R}^d}u_1\,\partial_{t_2}\rho\,dx. \tag{3.2}$$

Observe that with this formalism our definition (2.1) simply reads

$$d^2(\rho_0,\rho_1) = \inf\int_0^1\left\|\frac{d\rho_t}{dt}\right\|^2_{T_{\rho_t}\mathcal{M}^+}dt,$$

where the infimum is taken over all admissible curves connecting $\rho_0,\rho_1$. This is nothing but the classical Lagrangian formulation of (squared) distances in a reasonably smooth Riemannian manifold.
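The correspondence $\zeta \mapsto \mathbf{u}$ defined by (3.1) and the identity $\int u\,\zeta\,dx = \|\mathbf{u}\|^2_{H^1(d\rho)}$ can be illustrated on a discrete model. The following minimal sketch is only an illustration under stated assumptions: one space dimension, a uniform grid on $[0,1]$ with no-flux ends, a conservative finite-difference discretization of $-(\rho u')' + \rho u$, and an arbitrary positive density and right-hand side. None of these choices comes from the text, and the function names are hypothetical.

```python
import math

def assemble(rho, h):
    """Tridiagonal matrix of u -> -(rho u')' + rho u with no-flux ends.
    Edge weights are averages of neighbouring nodal densities."""
    n = len(rho)
    edge = [0.5 * (rho[i] + rho[i + 1]) for i in range(n - 1)]
    lower = [0.0] + [-e / h ** 2 for e in edge]   # coupling of node i with node i-1
    upper = [-e / h ** 2 for e in edge] + [0.0]   # coupling of node i with node i+1
    diag = [rho[i] - lower[i] - upper[i] for i in range(n)]
    return lower, diag, upper, edge

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (the matrix here is symmetric positive definite)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def h1_norm_squared(u, rho, edge, h):
    """Discrete version of int (|u'|^2 + |u|^2) d(rho)."""
    grad_part = sum(e * (u[i + 1] - u[i]) ** 2 for i, e in enumerate(edge)) / h
    zero_part = h * sum(r * v * v for r, v in zip(rho, u))
    return grad_part + zero_part

n = 200
h = 1.0 / n
x = [(i + 0.5) * h for i in range(n)]
rho = [1.0 + 0.5 * math.sin(2 * math.pi * xi) for xi in x]   # hypothetical density
zeta = [math.cos(math.pi * xi) for xi in x]                  # hypothetical tangent vector
lower, diag, upper, edge = assemble(rho, h)
u = solve_tridiagonal(lower, diag, upper, zeta)
pairing = h * sum(z * v for z, v in zip(zeta, u))            # discrete int u * zeta dx
print(pairing, h1_norm_squared(u, rho, edge, h))
```

In this discrete setting the two printed numbers coincide up to rounding by summation by parts, which mirrors the formal identity (3.2) with $\zeta_1 = \zeta_2$.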


Remark 3.1. It is worth pointing out that $\rho = 0$ is a degenerate point in our pseudo-manifold $\mathcal{M}^+$ since the tangent space is zero-dimensional.


3.2. Differential calculus in $(\mathcal{M}^+,d)$ and induced dynamics. Now that we have defined a Riemannian structure on $\mathcal{M}^+$ in terms of the distance $d$, one would like to differentiate functionals $\mathcal{F} : \mathcal{M}^+ \to \mathbb{R}$, or in other words compute gradients "$\operatorname{grad}_d\mathcal{F}(\rho)$" with respect to this Riemannian structure. This has already been done in the setting of optimal transport in the Wasserstein space $(\mathcal{P}_2,\mathcal{W}_2)$, see [39, 43] and references therein. The approach is once again very similar, and still formal.

Let us recall that if $(\mathcal{M},g)$ is a smooth Riemannian manifold and $\mathcal{F} : \mathcal{M} \to \mathbb{R}$, the Riemannian metric tensor $g$ and the Riesz representation theorem allow one to pull back the differential $D\mathcal{F}(x) \in T^*_x\mathcal{M}$, which is a cotangent vector, to a unique tangent vector $\operatorname{grad}\mathcal{F}(x) \in T_x\mathcal{M}$. For all curves $t \mapsto x_t$ passing through $x|_{t=0} = x$ with tangent vector $\frac{dx_t}{dt}\big|_{t=0} = \zeta$ the gradient $\operatorname{grad}\mathcal{F}$ is defined by duality via the chain rule

$$\frac{d}{dt}\Big|_{t=0}\mathcal{F}(x_t) = \langle D\mathcal{F}(x),\zeta\rangle_{T^*_x\mathcal{M},\,T_x\mathcal{M}} = g_x(\operatorname{grad}\mathcal{F}(x),\zeta).$$

Mimicking this computation and washing out all smoothness issues, it is easy to deduce here that the gradients of functionals $\mathcal{F} : \rho \in \mathcal{M}^+ \mapsto \mathbb{R}$ are at least formally given by

$$\operatorname{grad}_d\mathcal{F}(\rho) = -\operatorname{div}\Big(\rho\nabla\frac{\delta\mathcal{F}}{\delta\rho}\Big) + \rho\frac{\delta\mathcal{F}}{\delta\rho}, \qquad \|\operatorname{grad}_d\mathcal{F}(\rho)\|_{T_\rho\mathcal{M}^+} = \left\|\frac{\delta\mathcal{F}}{\delta\rho}\right\|_{H^1(d\rho)}, \tag{3.3}$$

where $\frac{\delta\mathcal{F}}{\delta\rho}$ denotes the first variation with respect to the usual Euclidean structure (e.g., if $\mathcal{F}(\rho) = \int U(x)\,d\rho(x)$, then $\frac{\delta\mathcal{F}}{\delta\rho} = U$).

With gradients available one can obtain induced dynamics on $(\mathcal{M}^+,d)$, the main examples being gradient flows and Hamiltonian flows. More precisely, given some functional $\mathcal{F}(\rho)$ on $\mathcal{M}^+$ the $\mathcal{F}$-gradient flow is defined as

$$\frac{d\rho}{dt} = -\operatorname{grad}_d\mathcal{F}(\rho).$$

Similarly, Hamiltonian systems read

$$\frac{d\rho}{dt} = -J\cdot\operatorname{grad}_d\mathcal{F}(\rho),$$

where $J$ is a given Hamiltonian antisymmetric operator (i.e., a closed 2-form on $T_\rho\mathcal{M}^+$). As an application of this Riemannian formalism we will study in Section 4 a particular gradient flow, and exploit the formal structure to derive new long-time convergence results by means of entropy-entropy dissipation inequalities.
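As a toy illustration of the gradient flow structure, consider the potential energy $\mathcal{F}(\rho) = \int U\,d\rho$ and a single weighted Dirac mass $\rho_t = k_t\delta_{x_t}$ in one dimension. Testing $\partial_t\rho = \operatorname{div}(\rho\nabla U) - \rho U$ against smooth functions formally gives $x' = -U'(x)$ and $k' = -kU(x)$, and the dissipation identity $\frac{d}{dt}\mathcal{F}(\rho_t) = -\|U\|^2_{H^1(d\rho_t)}$ should hold. The sketch below checks this numerically; the potential $U(x) = x^2/2$, the initial data and the function names are assumptions made only for the example.

```python
import math

def U(x):
    """Hypothetical potential, chosen only for illustration."""
    return 0.5 * x * x

def dU(x):
    return x

def rhs(state):
    """x' = -U'(x), k' = -k U(x) for rho_t = k_t * delta_{x_t}."""
    x, k = state
    return (-dU(x), -k * U(x))

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the two-dimensional system."""
    def shifted(s, ds, c):
        return (s[0] + c * ds[0], s[1] + c * ds[1])
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(state[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2))

def flow(x0, k0, t_end, n_steps=1000):
    state, dt = (x0, k0), t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt)
    return state

def exact(x0, k0, t):
    """Closed form for U(x) = x^2/2: x = x0 e^{-t}, log(k/k0) = -x0^2 (1 - e^{-2t}) / 4."""
    return (x0 * math.exp(-t), k0 * math.exp(-x0 * x0 * (1 - math.exp(-2 * t)) / 4))

def energy(state):
    """F(rho) = int U d(rho) for a single weighted Dirac mass."""
    x, k = state
    return k * U(x)

def dissipation(state):
    """||U||^2_{H^1(d rho)} = k * (|U'(x)|^2 + |U(x)|^2)."""
    x, k = state
    return k * (dU(x) ** 2 + U(x) ** 2)

x0, k0, h = 1.5, 2.0, 1e-4
state = flow(x0, k0, 1.0)
dF_dt = (energy(flow(x0, k0, 1.0 + h)) - energy(flow(x0, k0, 1.0 - h))) / (2 * h)
print(state, exact(x0, k0, 1.0))
print(dF_dt, -dissipation(state))
```

Both the position and the weight decay, and the numerical time derivative of the energy matches minus the squared $H^1(d\rho_t)$ norm of the first variation, as the norm formula in (3.3) suggests.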


Second order differential calculus can also be developed: the Hessian can be formally defined as

$$\langle\operatorname{Hess}_d\mathcal{F}(\rho)\cdot\zeta,\zeta\rangle_\rho = \frac{d^2}{dt^2}\Big|_{t=0,\ \mathrm{geod}}\mathcal{F}(\rho_t), \tag{3.4}$$

where differentiation should be performed along a geodesic path $\rho_t$ passing through $\rho$ with tangent vector $\zeta$, at time $t = 0$. In order to exploit this formula one needs to compute the geodesics with prescribed initial position and velocity, and the first issue is thus the very existence of these objects. As for the practical computation itself, consider for example $\mathcal{F}(\rho) = \int_{\mathbb{R}^d}d\rho$: the second time derivative is

$$\frac{d^2}{dt^2}\int_{\mathbb{R}^d}d\rho_t = \frac{d}{dt}\int_{\mathbb{R}^d}\big(-\operatorname{div}(\rho_t\nabla u_t) + \rho_tu_t\big) = \frac{d}{dt}\int_{\mathbb{R}^d}\rho_tu_t = \int_{\mathbb{R}^d}\big(u_t\partial_t\rho_t + \rho_t\partial_tu_t\big) = \int_{\mathbb{R}^d}\rho_t\big(|\nabla u_t|^2 + |u_t|^2 + \partial_tu_t\big).$$

Thus the second and key issue is to find an equation yielding an expression for $\partial_tu_t$ or $\partial_t(\rho_tu_t)$ in terms of $\rho$, $u$ and their spatial derivatives. The next two subsections are devoted to these questions. We will see that in the definition of the Hessian it is natural to replace the metric geodesics by some objects which we call trajectory geodesics. The latter possess fine minimization properties, exist for given initial value $\rho$ and velocity $\zeta$, and are described by nice PDEs, which, in particular, provide the required expressions of $\partial_tu_t$ and $\partial_t(\rho_tu_t)$ in terms of $\rho$ and $u$.

3.3. Trajectory geodesics. In order to introduce the concept of trajectory geodesic, we use a heuristic particle interpretation of the considered dynamics, which better illustrates the underlying ideas.

Think of the movement of charged particles in $\mathbb{R}^d$, whose mass is fixed but whose charge can possibly evolve according to some a priori given law described later on. Assume first that we have a finite number $i = 1\dots N$ of moving particles with position $x_t(i)$ and charge $k_t(i)$. In terms of densities and curves in $\mathcal{M}^+$ this dynamics can be formalized as

$$\rho_t = \sum_{i=1}^N k_t(i)\,r_t(i) \qquad\text{with}\qquad r_t(i) := \delta_{x_t(i)}.$$

Given a reasonably smooth potential $u_t(x)$, the function $\rho_t$ is the solution of the non-conservative continuity equation $\partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_tu_t$ if and only if

$$\frac{d}{dt}x_t(i) = \nabla u_t(x_t(i))$$

and

$$\frac{d}{dt}\big(\log k_t(i)\big) = u_t(x_t(i))$$

along the trajectory of each particle (this claim can be checked employing e.g. [38, Prop. 3.6], see however Remark 3.2 below). Define the energy of each individual particle by

$$E_i = \int_0^1 k_t(i)\left(\left|\frac{d}{dt}\log k_t(i)\right|^2 + \left|\frac{d}{dt}x_t(i)\right|^2\right)dt. \tag{3.5}$$

Then the total energy sums as

$$E = \sum_{i=1}^N E_i = \int_0^1\left(\int_{\mathbb{R}^d}\big(|\nabla u_t|^2 + |u_t|^2\big)\,d\rho_t\right)dt. \tag{3.6}$$

If the pair $(\rho_t,\mathbf{u}_t)_{t\in[0,1]}$ were to be a geodesic (in the sense of Theorem 6), then it should minimize the total energy $E$ for fixed $\rho_0$, $\rho_1$, and thus also minimize each of the $E_i$'s: more precisely, there should hold

$$E_i = \min\left\{\int_0^1\tilde k_t\left(\left|\frac{d}{dt}\log\tilde k_t\right|^2 + \left|\frac{d}{dt}\tilde x_t\right|^2\right)dt :\ \tilde x_0 = x_0(i),\ \tilde x_1 = x_1(i),\ \tilde k_0 = k_0(i),\ \tilde k_1 = k_1(i)\right\}. \tag{3.7}$$

It turns out to be relevant to replace the condition of being a geodesic by that of being a minimizer of each $E_i$ as in (3.7).

Remark 3.2. The above discussion is legitimate, e.g., if we have strictly $N$ different particles for all times, i.e., when the trajectories of the particles do not overlap and no mass splitting or coalescence occurs. Much more general scenarios allowing for trajectory crossing are certainly possible, provided that the potential $u$ loses spatial smoothness and the non-conservative continuity equation is understood in the weak sense. This is of course a delicate issue, which we shall ignore below by staying at the formal level.

We carry out now a similar formal analysis in a more general framework, 
considering general measures pt as the superposition of possibly infinitely 
many indivisible infinitesimal particles. 

Let $(\rho_t,\mathbf{u}_t)_{t\in[0,1]}$ be a generic admissible path in the sense of Definition 2.1 and assume that the velocity field $\nabla u_t$ is smooth enough. Let $T_t$ be the flow generated by the characteristic ODE $\frac{dy}{dt} = \nabla u_t(y)$, and define $r_t \in \mathcal{M}^+$ as the pushforward

$$r_t = T_t\#\rho_0. \tag{3.8}$$

Then $r_t$ is a curve in $\mathcal{M}^+$ with constant mass $m_t = d\rho_0(\mathbb{R}^d)$, and

$$\partial_tr_t + \operatorname{div}(r_t\nabla u_t) = 0. \tag{3.9}$$

As in the previous discrete setting, we think of $\rho_t$ as the charge density of moving charged particles and decompose it as a product

$$\rho_t(x) = r_t(x)\,k_t(x).$$

Here $r_t(x)$ defined by (3.8) is the particle density at time $t$ (density of particles without thinking of their charge), while $k_t(x)$ is the charge, initially normalized as $k_0(x) = 1$. When the particles move along their trajectories, they continuously change their charge according to

$$\partial_t\rho_t + \operatorname{div}(\rho_t\nabla u_t) = \rho_tu_t. \tag{3.10}$$

Denote the Lagrangian time derivative by

$$\frac{D}{Dt} = \partial_t + \nabla u_t\cdot\nabla.$$

Multiplying (3.9) by $k_t$ and subtracting the result from (3.10) it is easy to obtain $r_t\frac{Dk_t}{Dt} = r_tk_tu_t$, thus formally

$$\frac{D}{Dt}(\log k_t) = u_t \tag{3.11}$$

as in the previous discrete setting of finitely many particles.

Now switch to the Lagrangian framework and let $x_t(X) := T_t(X)$ be the trajectory of a particle starting at $x_0(X) = X$. From (3.11) we see that

$$\frac{d}{dt}x_t = \nabla u_t(x_t) \qquad\text{and}\qquad \frac{d}{dt}\big(\log k_t(x_t)\big) = u_t(x_t). \tag{3.12}$$

Thus, the infinitesimal energy along the trajectory of a single particle $x_t(X)$ is proportional to

$$\int_0^1 k_t\left(\left|\frac{D}{Dt}\log k_t\right|^2 + \left|\frac{Dx_t}{Dt}\right|^2\right)dt,$$

which is of course the continuous Lagrangian counterpart of (3.5). For fixed $X$ the functions $x(t) = x_t(X)$ and $k(t) = k_t(x_t(X))$ depend only on time, $\frac{D}{Dt}$ becomes the ordinary time derivative, and the energy reads

$$E_X(x,k) = \int_0^1\left(\frac{|k'|^2}{k} + k|x'|^2\right)dt. \tag{3.13}$$

The boundary conditions for $x, k$ are naturally

$$x_0 = X, \qquad x_1 = T_1(X), \qquad k_0 = 1, \qquad k_1 = \frac{\rho_1(T_1(X))}{r_1(T_1(X))}. \tag{3.14}$$

We temporarily forget about the origin of the functions $x, k$ and consider the minimization problem

$$E_X(x,k) = \min\big\{E_X(\tilde x,\tilde k) :\ \tilde x, \tilde k \text{ satisfy (3.14)},\ \tilde k_t > 0 \text{ for } 0 < t < 1\big\}. \tag{3.15}$$

Definition 3.3. An admissible curve $(\rho_t,\mathbf{u}_t)_{t\in[0,1]}$ is called a trajectory geodesic if the corresponding $x, k$ solve the minimization problem (3.15) for every $X \in \mathbb{R}^d$.


Given a trajectory geodesic and $X$, we can write the set of Euler-Lagrange equations for (3.15):

$$\frac{2k''}{k} - \frac{|k'|^2}{k^2} = |x'|^2, \tag{3.16}$$

$$(kx')' = 0. \tag{3.17}$$

From (3.17) we see that $kx'$ is conserved along the trajectories, that is, the trajectories are straight lines in the base space,

$$\frac{d}{dt}\left(\frac{x'}{|x'|}\right) = 0, \tag{3.18}$$

but the speed $|x'|$ is non-constant and is equal to $\mathrm{cst}/k$ along the trajectories. Furthermore, (3.16) can be rewritten in the form

$$2(\log k)'' + |(\log k)'|^2 = |x'|^2.$$

Recalling (3.12) we find that $2\frac{D}{Dt}u_t + |u_t|^2 = |\nabla u_t|^2$, or equivalently

$$\partial_tu_t + \frac{1}{2}\big(|\nabla u_t|^2 + |u_t|^2\big) = 0. \tag{3.19}$$

Moreover, (3.12) and (3.18) yield $\frac{D}{Dt}\left(\frac{\nabla u_t}{|\nabla u_t|}\right) = 0$, i.e.,

$$\partial_t\left(\frac{\nabla u_t}{|\nabla u_t|}\right) + \nabla u_t\cdot\nabla\left(\frac{\nabla u_t}{|\nabla u_t|}\right) = 0. \tag{3.20}$$
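The Euler-Lagrange system (3.16)-(3.17) can be sanity-checked on an explicit curve. The sketch below relies on an observation that is not taken from the text: in one space dimension, setting $z = \sqrt{k}\,(\cos(x/2),\sin(x/2))$ turns the integrand of (3.13) into $4|z'|^2$, so a straight segment in the $z$-plane traversed at constant speed is a natural candidate solution. The code evaluates the residual of (3.16) and the conserved quantity $kx'$ of (3.17) by finite differences along such a segment; the endpoint values and function names are arbitrary, hypothetical choices.

```python
import math

def cone_segment(k0, x0, k1, x1):
    """Path t -> (k(t), x(t)) obtained from the straight segment joining z0 and z1,
    where z = sqrt(k) * (cos(x/2), sin(x/2)). One space dimension; assumes
    |x1 - x0| < 2*pi so that the segment does not pass through the origin."""
    a0, a1 = math.sqrt(k0), math.sqrt(k1)
    z0 = (a0 * math.cos(x0 / 2), a0 * math.sin(x0 / 2))
    z1 = (a1 * math.cos(x1 / 2), a1 * math.sin(x1 / 2))

    def path(t):
        zx = (1 - t) * z0[0] + t * z1[0]
        zy = (1 - t) * z0[1] + t * z1[1]
        return zx * zx + zy * zy, 2 * math.atan2(zy, zx)

    return path

def derivatives(path, t, h=1e-4):
    """Central finite differences for k, k', k'' and x'."""
    (km, xm), (k, _), (kp, xp) = path(t - h), path(t), path(t + h)
    return k, (kp - km) / (2 * h), (kp - 2 * k + km) / (h * h), (xp - xm) / (2 * h)

def euler_lagrange_check(path, t):
    """Residual of (3.16) and the quantity k*x' conserved by (3.17)."""
    k, dk, ddk, dx = derivatives(path, t)
    return 2 * ddk / k - dk * dk / (k * k) - dx * dx, k * dx

def energy(path, n=2000):
    """Midpoint rule for E_X = int_0^1 (|k'|^2 / k + k |x'|^2) dt."""
    total = 0.0
    for i in range(n):
        k, dk, _, dx = derivatives(path, (i + 0.5) / n)
        total += (dk * dk / k + k * dx * dx) / n
    return total

k0, x0, k1, x1 = 1.0, 0.0, 2.0, 1.0
path = cone_segment(k0, x0, k1, x1)
checks = [euler_lagrange_check(path, t) for t in (0.2, 0.5, 0.8)]
closed_form = 4 * (k0 + k1 - 2 * math.sqrt(k0 * k1) * math.cos((x1 - x0) / 2))
print(checks)
print(energy(path), closed_form)
```

Along this curve the residual of (3.16) vanishes up to discretization error, $kx'$ takes the same value at all sampled times, and the energy equals $4|z_1 - z_0|^2$. The check only confirms that the curve solves the Euler-Lagrange equations; it makes no claim about minimality in general.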


In order to calculate Hessians as in (3.4), we need a procedure to construct a (trajectory) geodesic starting from any measure $\rho$ with prescribed initial velocity $\zeta \in T_\rho\mathcal{M}^+$. Let $\mathbf{u} = (u,\nabla u)$ be the corresponding initial potential (formally) determined by $\zeta = -\operatorname{div}(\rho\nabla u) + \rho u$, and consider the solution $(\rho_t,\mathbf{u}_t)_{t\in[0,\delta]}$ to the following problem:

$$\begin{cases}\partial_t\rho_t = -\operatorname{div}(\rho_t\nabla u_t) + \rho_tu_t,\\ \partial_tu_t = -\frac{1}{2}\big(|u_t|^2 + |\nabla u_t|^2\big),\\ \rho_0 = \rho,\quad u_0 = u.\end{cases} \tag{3.21}$$

A crucial and somewhat surprising observation is that $u_t$ satisfies (3.20): the reason is that (3.19) always implies (3.20). Indeed, we first compute

$$\frac{D}{Dt}(\nabla u_t) = -\frac{1}{2}\nabla\big(|u_t|^2 + |\nabla u_t|^2\big) + \nabla\nabla u_t\cdot\nabla u_t = -u_t\nabla u_t,$$

thus

$$\frac{D}{Dt}\left(\frac{\nabla u_t}{|\nabla u_t|}\right) = \frac{\frac{D}{Dt}(\nabla u_t)}{|\nabla u_t|} - \nabla u_t\,\frac{\big(\frac{D}{Dt}\nabla u_t\big)\cdot\nabla u_t}{|\nabla u_t|^3} = -\frac{u_t\nabla u_t}{|\nabla u_t|} + \nabla u_t\,\frac{u_t\nabla u_t\cdot\nabla u_t}{|\nabla u_t|^3} = 0.$$

Repeating and rearranging the argument above, we deduce that for each $X$ the corresponding pair $x, k$ solves the Euler-Lagrange equations (3.16), (3.17). Observe from (3.13) that the infinitesimal energy $E_X(x,k)$ is convex in $x', k, k'$, thus at least formally any solution of the Euler-Lagrange equations must be the unique minimizer, which by definition means that $(\rho_t,\mathbf{u}_t)$ is a trajectory geodesic.
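The simplest solutions of (3.21) are spatially homogeneous ones. If $u_t$ does not depend on $x$, the second equation reduces to $u' = -u^2/2$, and integrating the first equation in space gives $m' = mu$ for the total mass $m_t = d\rho_t(\mathbb{R}^d)$. The minimal sketch below integrates this reduced system numerically and compares it with the closed form $u_t = u_0/(1+u_0t/2)$, $m_t = m_0(1+u_0t/2)^2$; it also evaluates $\|\mathbf{u}_t\|^2_{H^1(d\rho_t)} = u_t^2m_t$. The reduction, the initial values and the function names are assumptions made for illustration.

```python
def rhs(state):
    """Spatially constant reduction of (3.21): u' = -u^2/2 and m' = m*u,
    where m is the total mass of rho_t."""
    u, m = state
    return (-0.5 * u * u, m * u)

def rk4(state, t_end, n_steps=2000):
    """Classical Runge-Kutta integration of the reduced system up to time t_end."""
    dt = t_end / n_steps
    for _ in range(n_steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def closed_form(u0, m0, t):
    """u(t) = u0 / (1 + u0 t / 2),  m(t) = m0 (1 + u0 t / 2)^2."""
    scale = 1.0 + 0.5 * u0 * t
    return (u0 / scale, m0 * scale * scale)

def squared_speed(state):
    """||u_t||^2_{H^1(d rho_t)} = u^2 * m when u is constant in space."""
    u, m = state
    return u * u * m

u0, m0 = -1.2, 3.0
results = {t: rk4((u0, m0), t) for t in (0.25, 0.5, 1.0)}
for t, numeric in results.items():
    print(t, numeric, closed_form(u0, m0, t), squared_speed(numeric))
```

In this reduced setting the squared speed stays equal to $u_0^2m_0$, and the mass is a quadratic polynomial in time with second derivative $u_0^2m_0/2$. Formally taking $u_0 = -2$ drives the mass to zero at $t = 1$ with total energy $4m_0$, which matches the value $d^2(\rho,0) = 4m$ quoted earlier from Proposition 2.6.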

We have just observed that an admissible curve $(\rho_t,u_t)$ is a trajectory
geodesic if and only if it satisfies both the Hamilton-Jacobi equation (3.19)
and the non-conservative continuity equation. It sounds plausible that a
trajectory geodesic should (at least formally) always be a geodesic (since it
simultaneously minimizes the energy along every trajectory), but this claim is
not necessarily true, see Section 3.5. The underlying obstacles are subtle and
fundamental, as similar factors undermine the Riemannian structure of the
Wasserstein space $(\mathcal{P}_2,W_2)$, see Remark 3.4. However, a weaker statement
holds true: the trajectory geodesics have constant metric speed in the sense
that
\[
\left\|\frac{d\rho_t}{dt}\right\|_{T_{\rho_t}\mathcal{M}^+} = \|u_t\|_{H^1(d\rho_t)} = \mathrm{cst}.
\]
Indeed, using (3.10), (3.19) and (3.2) we find at least formally for any
trajectory geodesic
\[
\frac{d}{dt}\|u_t\|^2_{H^1(d\rho_t)}
= 2\int_{\mathbb{R}^d}\partial_t u_t\,\partial_t\rho_t\,dx
+ \int_{\mathbb{R}^d}\left(|\nabla u_t|^2+|u_t|^2\right)\partial_t\rho_t\,dx = 0.
\]

Remark 3.4. A straightforward computation shows that the geodesic equations
(3.21) imply
\[
\begin{cases}
\partial_t\rho_t = -\operatorname{div}(\rho_t\nabla u_t) + \rho_t u_t,\\
\partial_t(\rho_t\nabla u_t) = -\operatorname{div}(\rho_t\nabla u_t\otimes\nabla u_t),\\
\rho_0=\rho,\quad u_0=u.
\end{cases}
\]
This problem is very similar to the geodesic equations for optimal
time-dependent transport of probability measures in the Wasserstein framework
\[
\begin{cases}
\partial_t\rho_t = -\operatorname{div}(\rho_t\nabla u_t),\\
\partial_t(\rho_t\nabla u_t) = -\operatorname{div}(\rho_t\nabla u_t\otimes\nabla u_t),\\
\rho_0=\rho,\quad u_0=u,
\end{cases}
\tag{3.22}
\]
which is a particular case of the sticky particles system [7]. A trajectory
geodesic formalism similar to the above one can of course be developed in
the Wasserstein space $(\mathcal{P}_2,W_2)$ (with $k\equiv 1$, no charge is considered). The
Wasserstein geodesics are trajectory geodesics in that setting. The reason is
that the trajectories of the dynamical optimal transport problem have constant
velocity [43], thus the respective Euler-Lagrange equation $x''=0$ is satisfied.
The converse statement is not always true. Note that since the exponential
map $\rho_t=\exp_{\rho_0}(t\zeta)$ for the Wasserstein geodesics cannot be properly defined
for all velocities, the classical Otto calculus (see, e.g., [39]) uses variants of the
system (3.22) for the calculation of the Hessian. Hence, from the perspective
of our formalism, the Otto calculus implicitly employs trajectory geodesics
(without defining them) in place of metric geodesics.
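
The computation behind Remark 3.4 can be sketched as follows. We assume here that the geodesic equations (3.21) consist of the continuity equation $\partial_t\rho_t+\operatorname{div}(\rho_t\nabla u_t)=\rho_tu_t$ together with the Hamilton-Jacobi equation in the form $\partial_tu_t+\frac12|\nabla u_t|^2+\frac12|u_t|^2=0$; this explicit form is an assumption of the sketch, chosen because it is the one that makes the metric speed constant. Dropping the subscript $t$ and writing $D^2u$ for the Hessian matrix of $u$, one finds formally
\[
\begin{aligned}
\partial_t(\rho\nabla u)
&= (\partial_t\rho)\,\nabla u+\rho\,\nabla(\partial_tu)\\
&= \big(-\operatorname{div}(\rho\nabla u)+\rho u\big)\nabla u-\rho\,D^2u\,\nabla u-\rho u\,\nabla u\\
&= -\operatorname{div}(\rho\nabla u)\,\nabla u-\rho\,D^2u\,\nabla u\\
&= -\operatorname{div}(\rho\nabla u\otimes\nabla u).
\end{aligned}
\]
The reaction contribution $\rho u\nabla u$ cancels against the zero-order term of the Hamilton-Jacobi equation, which is why the momentum equation takes the same form as in the Wasserstein system.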

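3.4. Hessians and $\lambda$-convexity. The above discussion suggests the following

Definition 3.5. The Hessian of a functional $\mathcal{F}:\mathcal{M}^+\to\mathbb{R}$ with respect to
our structure is formally defined by
\[
\langle \operatorname{Hess}_d\mathcal{F}(\rho)\cdot\zeta,\zeta\rangle_\rho
= \frac{d^2}{dt^2}\Big|_{t=0}\mathcal{F}(\rho_t). \tag{3.23}
\]
Here the path $\rho_t$ is determined by (3.21), where the initial potential
$\mathbf{u}=(u,\nabla u)$ is related to the tangent vector $\zeta$ via the elliptic equation
$-\operatorname{div}(\rho\nabla u)+\rho u=\zeta$.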

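Then we can give

Definition 3.6. A functional $\mathcal{F}$ on $\mathcal{M}^+$ is convex (resp. $\lambda$-convex, $\lambda\in\mathbb{R}$)
with respect to our structure provided
$\langle\operatorname{Hess}_d\mathcal{F}(\rho)\cdot\zeta,\zeta\rangle_\rho\geq 0$ (resp.,
$\langle\operatorname{Hess}_d\mathcal{F}(\rho)\cdot\zeta,\zeta\rangle_\rho\geq\lambda\langle\zeta,\zeta\rangle_\rho=\lambda\|u\|^2_{H^1(d\rho)}$).

Precise calculations of the Hessians are rather tedious, and as an example we
only present here the final result for the internal energy [43] determined for
absolutely continuous measures by
\[
\mathcal{E}(\rho)=\int_{\mathbb{R}^d}E(\rho(x))\,dx,
\]
where $E:\mathbb{R}^+\to\mathbb{R}$ is a given measurable energy density. For this functional
one can compute explicitly
\[
\begin{aligned}
\langle\operatorname{Hess}_d\mathcal{E}(\rho)\cdot\zeta,\zeta\rangle_\rho
= \int_{\mathbb{R}^d}\Big\{ & P(\rho)\Gamma_2(u) + P_2(\rho)|\Delta u|^2 - \big(2P_2(\rho)+P(\rho)\big)u\Delta u\\
& + \Big(Q_2(\rho)-\tfrac12Q(\rho)-P_2(\rho)\Big)|\nabla u|^2
+ \Big(Q_2(\rho)-\tfrac12Q(\rho)\Big)|u|^2 \Big\}\,dx.
\end{aligned}
\]
Here we have used the notation
\[
\begin{aligned}
P(\rho) &= E'(\rho)\rho-E(\rho), & P_2(\rho) &= P'(\rho)\rho-P(\rho),\\
Q(\rho) &= E'(\rho)\rho, & Q_2(\rho) &= Q'(\rho)\rho,
\end{aligned}
\qquad
\Gamma_2(u)=\sum_{i,j=1}^d\left|\partial_{x_ix_j}u\right|^2.
\]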
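
As an illustration of how the notation is used, consider the quadratic density $E(\rho)=\frac12\rho^2$ (a choice made here only for the sake of example), and read the Hessian formula with factors $\frac12$ in front of $Q(\rho)$ in its last two terms. Then
\[
P(\rho)=\tfrac12\rho^2,\qquad Q(\rho)=\rho^2,\qquad P_2(\rho)=\tfrac12\rho^2,\qquad Q_2(\rho)=2\rho^2,
\]
and the formula reduces to
\[
\langle\operatorname{Hess}_d\mathcal{E}(\rho)\cdot\zeta,\zeta\rangle_\rho
=\int_{\mathbb{R}^d}\rho^2\Big\{\tfrac12\Gamma_2(u)+\tfrac12|\Delta u|^2-\tfrac32\,u\Delta u+|\nabla u|^2+\tfrac32|u|^2\Big\}\,dx .
\]
A quick consistency check, under the assumption that the Hamilton-Jacobi equation reads $\partial_tu+\frac12|\nabla u|^2+\frac12|u|^2=0$: for a spatially constant potential $u\equiv c$ one has $\partial_t\rho=\rho u$ and $\partial_tu=-\frac12u^2$, so that $\frac{d^2}{dt^2}\frac12\int\rho^2=\int(2\rho^2u^2+\rho^2\partial_tu)=\frac32c^2\int\rho^2$ at $t=0$, in agreement with the last term above.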
3.5. Distance and geodesics between two one-point measures. We
conclude this section with a highly illustrative example, which in particular
will show that geodesics may be non-unique and that trajectory geodesics
are not necessarily metric geodesics. Namely, we are going to describe the
geodesics and calculate the distance between two one-point measures
$\rho_0=k_0\delta_{x_0}$ and $\rho_1=k_1\delta_{x_1}$ with $x_0,x_1\in\mathbb{R}^d$, $k_0,k_1>0$, and $\ell:=|x_0-x_1|>0$.
Observe that the case when one of the masses vanishes fits into Proposition
2.6, and the case when $\ell=0$ fits into Proposition 2.8.

In order to send $\rho_0$ to $\rho_1$, a first natural strategy is to treat the dynamics
as the movement of a sole indivisible particle $\rho_t=k_t\delta_{x_t}$ moving from $x_0$
to $x_1$, as described in Section 3.3. The corresponding energy consumption
for this transport strategy is
\[
E_{tr}(k_0,k_1,\ell)=\min\left\{\int_0^1k_t\left(\left|\frac{d}{dt}(\log k_t)\right|^2+\left|\frac{d}{dt}x_t\right|^2\right)dt \;:\;
\begin{array}{l}
x_t|_{t=0}=x_0,\ x_t|_{t=1}=x_1,\\
k_t|_{t=0}=k_0,\ k_t|_{t=1}=k_1,\ k_t>0\ \text{for}\ 0<t<1
\end{array}
\right\}, \tag{3.24}
\]
cf. (3.7). A laborious but rather elementary calculus of variations shows
that this minimization problem has a solution for
\[
\ell<2\pi,
\]
which has the form
\[
k_t=a(t-b)^2+c, \tag{3.25}
\]
\[
x_t=x_0+2\,\frac{x_1-x_0}{\ell}\left[\arctan\!\big((t-b)\sqrt{a/c}\big)+\arctan\!\big(b\sqrt{a/c}\big)\right], \tag{3.26}
\]
\[
E_{tr}(k_0,k_1,\ell)=4a, \tag{3.27}
\]
where
\[
a=k_0+k_1-2\cos(\ell/2)\sqrt{k_0k_1}, \tag{3.28}
\]
\[
b=\frac{k_0-\cos(\ell/2)\sqrt{k_0k_1}}{k_0+k_1-2\cos(\ell/2)\sqrt{k_0k_1}}, \tag{3.29}
\]
\[
c=\frac{k_0k_1\sin^2(\ell/2)}{k_0+k_1-2\cos(\ell/2)\sqrt{k_0k_1}}. \tag{3.30}
\]
Observe that the solution (3.26) ceases to exist whenever $\ell=|x_1-x_0|\geq2\pi$.
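
The closed-form solution above lends itself to a quick numerical sanity check. The following minimal Python sketch evaluates the coefficients (3.28)-(3.30) and verifies the endpoint conditions and the constancy of the energy density along the curve. The function names are hypothetical, and the integrand is read as $k_t\big(|\frac{d}{dt}\log k_t|^2+|\frac{d}{dt}x_t|^2\big)$, which is an assumption about (3.24).

```python
import math


def transport_coefficients(k0, k1, ell):
    """Return (a, b, c) from (3.28)-(3.30) for k0, k1 > 0 and 0 < ell < 2*pi."""
    if k0 <= 0 or k1 <= 0 or not 0 < ell < 2 * math.pi:
        raise ValueError("need k0, k1 > 0 and 0 < ell < 2*pi")
    cross = math.cos(ell / 2) * math.sqrt(k0 * k1)
    a = k0 + k1 - 2 * cross
    b = (k0 - cross) / a
    c = k0 * k1 * math.sin(ell / 2) ** 2 / a
    return a, b, c


def mass(t, a, b, c):
    """Charge k_t of the moving particle, formula (3.25)."""
    return a * (t - b) ** 2 + c


def travelled_fraction(t, a, b, c, ell):
    """Scalar s(t) such that x_t = x_0 + s(t) * (x_1 - x_0), read off from (3.26)."""
    r = math.sqrt(a / c)
    return (2 / ell) * (math.atan((t - b) * r) + math.atan(b * r))


def energy_density(t, a, b, c):
    """Integrand k_t * (|d/dt log k_t|^2 + |d/dt x_t|^2) evaluated along the curve."""
    k = mass(t, a, b, c)
    dk = 2 * a * (t - b)                      # derivative of (3.25)
    r = math.sqrt(a / c)
    speed = 2 * r / (1 + ((t - b) * r) ** 2)  # |d/dt x_t|, derivative of (3.26)
    return dk ** 2 / k + k * speed ** 2


if __name__ == "__main__":
    k0, k1, ell = 1.0, 2.0, 4.0
    a, b, c = transport_coefficients(k0, k1, ell)
    print("k_0, k_1:", mass(0.0, a, b, c), mass(1.0, a, b, c))
    print("fraction travelled at t=1:", travelled_fraction(1.0, a, b, c, ell))
    print("energy density vs 4a:", energy_density(0.3, a, b, c), 4 * a)
```

Since the energy density is constant in time, its integral over $[0,1]$ coincides with its value at any instant, which is how the sketch compares it with $4a$ from (3.27).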
Another natural strategy, made possible by the reaction term $\rho_tu_t$ in the
non-conservative continuity equation, is to kill all the mass at $x_0$ while
simultaneously creating mass at $x_1$. This means that we treat the dynamics
as the evolution of two distinct particles $\rho_t=k_t(1)\delta_{x_0}+k_t(2)\delta_{x_1}$ with fixed
positions but non-constant charges, with
\[
k_0(1)=k_0,\quad k_1(1)=0\quad\text{and}\quad k_0(2)=0,\quad k_1(2)=k_1.
\]
The corresponding energy intake for this stationary strategy can be
calculated using Proposition 2.6, viz.
\[
E_{st}=4(k_0+k_1). \tag{3.31}
\]
Note that this strategy also works for $\ell\geq2\pi$, contrary to the previous
transport one.

We now make the ansatz that a geodesic joining two one-point measures
$\rho_0=k_0\delta_{x_0}$, $\rho_1=k_1\delta_{x_1}$ should use a mixture of these transport and/or
stationary strategies, corresponding to an evolution of (at most) three particles:
the first one travels between $x_0$ and $x_1$; the second one stays at $x_0$ and its
charge goes to zero as time evolves; the third is always at location $x_1$, and
builds up its charge starting from zero at $t=0$ (this geometry may require
a loss of smoothness of the driving potential $u$, cf. Remark 3.2). That is,
\[
\rho_t=\sum_{i=1}^3k_t(i)\,r_t(i),\qquad r_t(i)=\delta_{x_t(i)},
\]
\[
x_0(1)=x_0,\ x_1(1)=x_1,\qquad x_0(2)=x_1(2)=x_0,\qquad x_0(3)=x_1(3)=x_1,
\]
\[
k_0(1)+k_0(2)=k_0,\qquad k_1(1)+k_1(3)=k_1,\qquad k_1(2)=0,\ k_0(3)=0.
\]
The heuristics behind this ansatz is that indivisible infinitesimal particles
which have non-zero charge at the initial and final times are considered to
be in the first bulk, the ones which have zero charge at time $t=1$ constitute
the second bulk, and the remaining ones go to the third bulk. Denoting the
independent parameters $k_0(1)$ and $k_1(1)$ by $\gamma_0$ and $\gamma_1$, resp., we discover that
the total energy intake is
\[
\begin{aligned}
E(\gamma_0,\gamma_1)&=E_1+E_2+E_3\\
&=4\left[\gamma_0+\gamma_1-2\cos(\ell/2)\sqrt{\gamma_0\gamma_1}\right]+4(k_0-\gamma_0)+4(k_1-\gamma_1)\\
&=4\left[k_0+k_1-2\cos(\ell/2)\sqrt{\gamma_0\gamma_1}\right]
\end{aligned}
\]
(here we implicitly restrict to $\ell<2\pi$ so that both strategies are feasible).
For $\ell<\pi$ one can check that the minimum of $E(\gamma_0,\gamma_1)$ is achieved for
$\gamma_0=k_0$, $\gamma_1=k_1$, i.e. the pure transport strategy is optimal. For $\ell>\pi$
the energy minimum is achieved at $\gamma_0=\gamma_1=0$, i.e. the geodesic only uses
the pure stationary strategy. At the critical value $\ell=\pi$, any combination
of the two strategies (i.e. any choice of the free parameters $\gamma_0\in[0,k_0]$ and
$\gamma_1\in[0,k_1]$) does an optimal job, and in particular we see that the geodesics
are definitely not unique. Furthermore, for $\pi<\ell<2\pi$, the pure transport
curve determined by (3.25)-(3.30) is a trajectory geodesic by construction,
but not a metric geodesic (since the stationary strategy is less expensive in
terms of energy).
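
The case distinction above can be summarized in a small helper. The Python sketch below returns the squared distance between two one-point measures as predicted by the three-particle ansatz; it merely encodes (3.27)-(3.28) and (3.31) and is not an independent verification that the ansatz is optimal. The function name is hypothetical.

```python
import math


def dirac_distance_squared(k0, k1, ell):
    """Squared distance between k0*delta_{x0} and k1*delta_{x1} with |x0 - x1| = ell > 0,
    according to the three-particle ansatz of the text."""
    if k0 <= 0 or k1 <= 0 or ell <= 0:
        raise ValueError("need k0, k1 > 0 and ell > 0")
    if ell >= math.pi:
        # gamma_0 = gamma_1 = 0: pure stationary strategy, cost (3.31)
        return 4 * (k0 + k1)
    # gamma_0 = k0, gamma_1 = k1: pure transport strategy, cost (3.27)-(3.28)
    return 4 * (k0 + k1 - 2 * math.cos(ell / 2) * math.sqrt(k0 * k1))


if __name__ == "__main__":
    for ell in (0.1, 1.0, 3.0, math.pi, 5.0, 10.0):
        print(ell, dirac_distance_squared(1.0, 1.0, ell))
```

The two branches agree at $\ell=\pi$ because $\cos(\pi/2)=0$, so the resulting cost is continuous in $\ell$ and saturates at $4(k_0+k_1)$ for large separations.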

Let us now consider the particular case of probability measures $k_0=k_1=1$
and compare $d^2(\delta_{x_0},\delta_{x_1})$ with the Wasserstein distance $W_2^2(\delta_{x_0},\delta_{x_1})$, which
is clearly equal to $\ell^2=|x_1-x_0|^2$ in this example. Remember that by
Proposition 2.9 our distance does not a priori exceed the Wasserstein one.
For $\ell\geq\pi$ this is trivial since $W_2^2=\ell^2\geq\pi^2>8=d^2$, where $d$ is accordingly
computed using the stationary cost (3.31) because $\ell\geq\pi$. On the other hand,
for small $\ell<\pi$ the optimal strategy is the transport one, and from (3.27)-(3.28)
we get $W_2^2-d^2=\ell^2-(8-8\cos(\ell/2))=o(\ell^3)$. This means that our
distance and the Wasserstein one agree at least at order two for short ranges,
i.e.
\[
W_2(\delta_{x_0},\delta_{x_1})-d(\delta_{x_0},\delta_{x_1})=o\big(|x_1-x_0|^2\big)
\]
(both $W_2$ and $d$ being of order $\ell=|x_1-x_0|$).
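
The short-range agreement can be made quantitative by a Taylor expansion; the following worked computation is a sketch that only uses the formulas (3.27)-(3.28) with $k_0=k_1=1$. Since $\cos(\ell/2)=1-\frac{\ell^2}{8}+\frac{\ell^4}{384}+O(\ell^6)$, we get
\[
W_2^2-d^2=\ell^2-8+8\cos(\ell/2)=\frac{\ell^4}{48}+O(\ell^6),
\qquad
W_2-d=\frac{W_2^2-d^2}{W_2+d}=\frac{\ell^3}{96}+O(\ell^5),
\]
where the second identity uses $W_2+d=2\ell+O(\ell^3)$. In particular the difference of the two distances is of third order in the separation $\ell$.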

Remark 3.7. This threshold effect (at $\ell=\pi$) is natural since the cost of the
stationary strategy (3.31) is independent of the distance between the supports,
while the cost of the pure transport strategy (3.25)-(3.30) really does depend on how far
mass is transported. This might be relevant in the context of image processing:
our distance non-linearly interpolates between "Wasserstein-like/transport" at
close range, and basically total variation for large distances. A similar effect
was discovered in m for the generalized Wasserstein distance considered 
therein. The bounded-Lipschitz distance also has such property. 

4. Application to a model of population dynamics 

As an application of the abstract ideas from the previous sections we consider
a problem in spatial population dynamics. As originally proposed in
(an El], the model describes a population of organisms inhabiting a domain 
$\Omega\subset\mathbb{R}^d$ whose macroscopic distribution $\rho(t,x)$ evolves according to
reproduction and migration. At each point $x\in\Omega$ lies a prescribed quantity of
resources $m(x)$, and the local per capita reproduction rate is assumed to
be proportional to the fitness $m(x)-\rho$. In the absence of displacement
of individuals the local population dynamics is thus given by the heterogeneous
logistic equation $\frac{d\rho}{dt}=(m(x)-\rho)\rho$. The spatial migration is then
dictated by the desire of individuals to maximize their local fitness in order
to reproduce as best as possible, corresponding to an advection with local
velocity $v=\nabla(m-\rho)$. Assuming that the animals cannot exit through walls
and thus imposing a no-flux boundary condition, one obtains the following
fitness-driven model:
\[
\begin{cases}
\partial_t\rho=\operatorname{div}(\rho\nabla(\rho-m))+\rho(m-\rho), & x\in\Omega,\ t>0,\\[2pt]
\rho\,\dfrac{\partial\rho}{\partial\nu}=\rho\,\dfrac{\partial m}{\partial\nu}, & x\in\partial\Omega,\ t>0,\\[2pt]
\rho(0,x)=\rho_0(x), & x\in\Omega.
\end{cases}
\tag{4.1}
\]

Here $\Omega\subset\mathbb{R}^d$ is a bounded, connected, open domain with smooth boundary,
$\nu$ is the outer normal to $\partial\Omega$, $m(x)>0$ is a given function (assumed to be
at least $C^{2,\alpha}(\overline{\Omega})$), and $\rho(t,x)$ is the unknown density of the population. We
consider nonnegative solutions of (4.1), so the initial datum $\rho_0(x)\in C(\overline{\Omega})$ is
supposed to be nonnegative as well.
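
To get a feeling for the dynamics of (4.1), one may run a crude one-dimensional simulation. The sketch below is only an illustration under several assumptions that are not part of the text: the domain is $\Omega=(0,1)$, the resource profile `m` and the initial datum are hypothetical choices, and the discretization is a simple explicit finite-volume scheme with vanishing boundary fluxes. It records the discrete $L^2$ distance between $\rho(t)$ and $m$.

```python
import math


def simulate(n=40, dt=1e-4, t_final=2.0):
    """Explicit finite-volume scheme for (4.1) on (0, 1) with no-flux walls.

    Returns the final density, the resource profile and the history of the
    discrete L2 distance between rho and m (one entry per time step)."""
    h = 1.0 / n
    x = [(i + 0.5) * h for i in range(n)]
    m = [1.0 + 0.5 * math.cos(math.pi * xi) for xi in x]  # hypothetical resources, m > 0
    rho = [0.5] * n                                        # hypothetical initial datum

    def l2_error(r):
        return math.sqrt(h * sum((ri - mi) ** 2 for ri, mi in zip(r, m)))

    errors = [l2_error(rho)]
    for _ in range(int(round(t_final / dt))):
        w = [ri - mi for ri, mi in zip(rho, m)]            # w = rho - m (minus the fitness)
        # flux[i] approximates rho * d/dx(rho - m) at the interface between cells i-1 and i;
        # flux[0] = flux[n] = 0 encodes the no-flux boundary condition
        flux = [0.0] * (n + 1)
        for i in range(1, n):
            flux[i] = 0.5 * (rho[i - 1] + rho[i]) * (w[i] - w[i - 1]) / h
        # transport term div(rho grad(rho - m)) plus reaction term rho * (m - rho)
        rho = [rho[i] + dt * ((flux[i + 1] - flux[i]) / h - rho[i] * w[i])
               for i in range(n)]
        errors.append(l2_error(rho))
    return rho, m, errors


if __name__ == "__main__":
    rho, m, errors = simulate()
    print("initial L2 distance to m:", errors[0])
    print("final   L2 distance to m:", errors[-1])
```

The time step is chosen small with respect to $h^2$ because the scheme is explicit; a different mesh size would require adjusting `dt` accordingly.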

In the previous sections we constructed our distance $d$ between measures on
the whole space $\mathbb{R}^d$ by means of non-conservative continuity equations
$\partial_t\rho_t+\operatorname{div}(\rho_t\nabla u_t)=\rho_tu_t$, which allowed us to consider mass changes driven by the
reaction term $\rho_tu_t$. Working in bounded domains $\Omega$ and imposing a no-flux
condition on the velocity field $\nabla u_t$, so that the mass can only change through
the reaction term, the exact same construction formally gives a distance $d$
between measures in $\Omega$ (here we ignore all delicate issues related to boundary
conditions and remain formal). As a result all the Riemannian formalism
in Section 3 remains valid in the case of bounded domains; in particular,
formula (3.3) allows us to compute gradients of functionals $\mathcal{F}:\mathcal{M}^+(\Omega)\to\mathbb{R}$.
Anticipating that (4.1) enjoys a comparison principle and that solutions to
the Cauchy problem remain nonnegative, we rather consider $\rho(t,x)$ as curves
$t\mapsto\rho_t\in\mathcal{M}^+(\Omega)$. Using (3.3) it is easy to check, again formally, that (4.1)
can be written as the gradient flow
\[
\frac{d\rho}{dt}=-\operatorname{grad}_d\mathcal{E}(\rho),\qquad\mathcal{E}(\rho)=\frac12\int_\Omega|\rho(x)-m(x)|^2\,dx \tag{4.2}
\]
with respect to the distance $d$ (just write $\frac{\delta\mathcal{E}}{\delta\rho}=\rho-m$). In the sequel we will
often refer to $\mathcal{E}$ as the entropy functional, as is common for gradient flows in
metric spaces. In view of (4.2) it is clear that the system is energetically
driven by the fitness $m-\rho$, and it is natural to expect long-time convergence
to the (unique) least-entropy stationary solution $\rho(t)\to m$ (in some sense to
be made precise shortly). In fact this was proved in [19] by means of comparison
principles, but without convergence rate. Our main result in this section
states that convergence is exponential:

Theorem 7. Assume that $m(x)>0$ in $\overline{\Omega}$. Then for any nonnegative initial
datum $\rho_0\in C(\overline{\Omega})$, $\rho_0\not\equiv0$, there is $\gamma=\gamma(\rho_0,\Omega,m)>0$ such that the weak
solution of (4.1) satisfies
\[
\forall t\geq0:\qquad\|\rho(t)-m\|_{L^2(\Omega)}\leq e^{-\gamma t}\|\rho_0-m\|_{L^2(\Omega)}.
\]
The precise dependence of the rate $\gamma$ on the initial datum $\rho_0$ will be derived
along the proof, see Remark 4.3 below.

Denoting the entropy dissipation along trajectories by
\[
\mathcal{D}(\rho):=-\frac{d}{dt}\mathcal{E}(\rho)=\|\operatorname{grad}_d\mathcal{E}(\rho)\|^2,
\]
a classical way to retrieve exponential convergence $\rho(t)\to\rho^*$ for gradient
flows is to obtain an entropy-entropy production inequality
\[
\mathcal{D}(\rho)\geq\lambda\big(\mathcal{E}(\rho)-\mathcal{E}(\rho^*)\big) \tag{4.3}
\]
for some $\lambda>0$. This implies trend to equilibrium in the entropy sense
$\mathcal{E}(\rho(t))\searrow\mathcal{E}(\rho^*)$ with exponential rate $\lambda$, and then using a Csiszár-Kullback
inequality one usually deduces that $\rho(t)\to\rho^*$ in some sense (typically
$\|\rho(t)-\rho^*\|_{L^1}\to0$). We refer to [43, Chapter 9] for introductory material on this
topic.

Note that the least entropy stationary solution is here $\rho^*=m(x)$ and
that $\mathcal{E}(\rho^*)=0$. Formally multiplying (4.1) by $\rho-m$ and integrating, the
dissipation can be computed here as
\[
\mathcal{D}(\rho)=\|\rho-m\|^2_{H^1(d\rho)}=\int_\Omega\big(|\nabla(\rho-m)|^2+|\rho-m|^2\big)\rho\,dx,
\]
which is of course consistent with $\mathcal{D}=\|\operatorname{grad}_d\mathcal{E}\|^2$ and the Riemannian
formalism $\|\zeta\|_{T_\rho\mathcal{M}^+}=\|u\|_{H^1(d\rho)}$ from Section 3. Observe that, due to the
very definition of $\mathcal{E}(\rho)=\frac12\|\rho-m\|^2_{L^2(\Omega)}$, convergence in the entropy sense
$\mathcal{E}(\rho(t))\to\mathcal{E}(m)=0$ already implies convergence $\rho(t)\to m$ in the stronger
$L^2(\Omega)$ sense. As a consequence we can dispense with Csiszár-Kullback
inequalities and we only need an entropy-entropy production inequality as in (4.3). The
latter classically holds (uniformly in the initial data) as soon as the entropy
is $\lambda$-convex for some $\lambda>0$ (in the sense of Definition 3.6, see Section 3 for
second-order calculus). This can be tested here at least formally using the
geodesic equations (3.21). Unfortunately, for generic $m$ the specific entropy
$\mathcal{E}(\rho)=\frac12\int_\Omega|\rho-m|^2$ does not appear to be $\lambda$-convex for any $\lambda>0$. We use
instead a generalized Beckner inequality (Theorem 8), and as a consequence
we only obtain exponential convergence for rates depending on the initial
data as in Theorem 7.


From now on we claim mathematical rigor. Following [19] weak solutions 
are defined as 


Definition 4.1. A nonnegative function 

p e C([0, +oo); n lL([ 0, n Li^,([0, Too) X Q) (4.4) 

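is called a global weak solution of the Cauchy problem (4.1) if it satisfies the
identity
\[
\begin{aligned}
-\int_0^T\!\!\int_\Omega\rho\,\partial_t\varphi\,dx\,dt&+\int_\Omega\rho(T,x)\varphi(T,x)\,dx-\int_\Omega\rho_0(x)\varphi(0,x)\,dx\\
&=-\int_0^T\!\!\int_\Omega\rho\nabla(\rho-m)\cdot\nabla\varphi\,dx\,dt+\int_0^T\!\!\int_\Omega\rho(m-\rho)\varphi\,dx\,dt
\end{aligned}
\tag{4.5}
\]
for any $T>0$ and $\varphi\in C^1([0,T]\times\overline{\Omega})$.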


The existence and uniqueness for weak solutions of problem (4.1) were
established in [19], as well as a comparison principle. We start by rigorously
deriving the dissipation (entropy production) equality:

Lemma 4.2. Any weak solution actually satisfies $\rho\in C([0,\infty);L^2(\Omega))$. The
dissipation
\[
\mathcal{D}(t):=\int_\Omega\big(|\nabla(\rho(t)-m)|^2+|\rho(t)-m|^2\big)\rho(t)\,dx
\]
belongs to $L^1_{loc}([0,\infty))$, and
\[
\frac{d\mathcal{E}}{dt}(t)=-\mathcal{D}(t)
\]
in the sense of distributions $\mathcal{D}'(0,\infty)$.

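Proof. Since $\rho$ is uniformly bounded and $\rho,m\in L^2_{loc}([0,\infty);H^1(\Omega))$ it is
straightforward to check that $\mathcal{D}\in L^1_{loc}([0,\infty))$. By density of $C^1(\overline{\Omega})$ in $H^1(\Omega)$
and testing $\varphi(t,x)=\eta(t)\psi(x)$ with $\eta\in C_c^\infty(0,T)$ in the definition of weak
solutions it is easy to see that $\partial_t\rho\in L^2_{loc}([0,\infty);(H^1(\Omega))^*)$. By the embedding
$H^1(\Omega)\subset\subset L^2(\Omega)\subset(H^1(\Omega))^*$ one classically obtains $\rho\in C([0,T];L^2(\Omega))$ for
all $T>0$.
Since we know now that $\partial_t(\rho-m)=\partial_t\rho\in L^2_{loc}([0,\infty);(H^1(\Omega))^*)$ we can
legitimately take $\rho-m\in L^2_{loc}([0,\infty);H^1(\Omega))$ as a test function in (4.1), from
which it classically follows that
\[
\begin{aligned}
\frac{d}{dt}\mathcal{E}(\rho(t))&=\frac{d}{dt}\left(\frac12\|\rho(t)-m\|^2_{L^2(\Omega)}\right)=\langle\partial_t\rho,\rho-m\rangle_{(H^1)^*,H^1}\\
&=-\int_\Omega\big(|\nabla(\rho(t)-m)|^2+|\rho(t)-m|^2\big)\rho(t)=-\mathcal{D}(t)
\end{aligned}
\]
for a.e. $t\in(0,\infty)$ and in $L^1_{loc}([0,\infty))$ as desired. □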

Proof of Theorem 7. In order to apply our generalized Beckner inequality
and retrieve an entropy-entropy production inequality, we first need to bound
the mass of $\rho(t)$ from below. To this end we let $\underline{\rho}_0(x)=\min\{m(x),\rho_0(x)\}$,
and define $\underline{\rho}(t,x)$ to be the unique solution of the Cauchy problem with initial
datum $\underline{\rho}_0$. Applying the comparison principle [19, Lemma 4.2] we have that
\[
\underline{\rho}(t,x)\leq\rho(t,x)\quad\text{for a.e. }t,x
\]
and it suffices to show that $\int_\Omega\underline{\rho}(t)\geq c_0>0$. Because $0$ and $m(x)$ are
stationary solutions of (4.1) (thus respectively sub and super solutions) the
comparison principle ensures that $0\leq\underline{\rho}(t,x)\leq m(x)$. Testing $\varphi\equiv1$ in the
weak formulation we get $\frac{d}{dt}\int_\Omega\underline{\rho}(t)=\int_\Omega\underline{\rho}(t)(m-\underline{\rho}(t))\geq0$, whence
\[
\int_\Omega\rho(t)\geq\int_\Omega\underline{\rho}(t)\geq\int_\Omega\underline{\rho}_0=:c_0>0
\]
as desired.

Since we are considering $m(x)>0$ we can apply the generalized Beckner
inequality, Theorem 8 in the appendix, to get
\[
\Phi(c_0)\int_\Omega|\rho(t)-m|^2\leq\Phi\big(\|\rho(t)\|_{L^1(\Omega)}\big)\int_\Omega|\rho(t)-m|^2
\leq\int_\Omega\big(|\nabla(\rho(t)-m)|^2+|\rho(t)-m|^2\big)\rho(t).
\]
Hence by Lemma 4.2
\[
\frac{d\mathcal{E}}{dt}(t)=-\mathcal{D}(t)\leq-2\gamma\,\mathcal{E}(t)\quad\text{a.e. }t\geq0
\]
with $\gamma=\Phi(c_0)$. By standard Gronwall arguments we conclude that
\[
\mathcal{E}(t)\leq e^{-2\gamma t}\mathcal{E}(0),
\]
which in turn yields $\|\rho(t)-m\|_{L^2(\Omega)}\leq e^{-\gamma t}\|\rho_0-m\|_{L^2(\Omega)}$ and completes the
proof. □
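
For the reader's convenience, the Gronwall step can be spelled out as follows; this is a standard argument, written with $\mathcal{E}(t)=\frac12\|\rho(t)-m\|^2_{L^2(\Omega)}$ and with $\gamma$ denoting the rate from the proof. The differential inequality $\frac{d\mathcal{E}}{dt}\leq-2\gamma\mathcal{E}$ gives
\[
\frac{d}{dt}\Big(e^{2\gamma t}\mathcal{E}(t)\Big)=e^{2\gamma t}\Big(\frac{d\mathcal{E}}{dt}(t)+2\gamma\mathcal{E}(t)\Big)\leq0
\quad\Longrightarrow\quad
\mathcal{E}(t)\leq e^{-2\gamma t}\mathcal{E}(0),
\]
and taking square roots in $\frac12\|\rho(t)-m\|^2_{L^2(\Omega)}\leq e^{-2\gamma t}\,\frac12\|\rho_0-m\|^2_{L^2(\Omega)}$ halves the exponent, which explains why the entropy decays at rate $2\gamma$ while the $L^2$ norm decays at rate $\gamma$.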

Remark 4.3. From the proof above it is clear that the rate $\gamma>0$ in Theorem 7
only depends on the initial datum through $c_0=\int_\Omega\min\{\rho_0,m\}\,dx$.

5. Appendix 

Lower-semicontinuity of the bounded-Lipschitz distance. 

Lemma 5.1. The bounded-Lipschitz distance $d_{BL}$ is sequentially lower
semicontinuous with respect to the weak-* topology.

The proof is obvious since the supremum in the definition of $d_{BL}$ can be
restricted to smooth compactly supported functions, which are dense in $C_0$.

A variant of the Banach-Alaoglu theorem. 

Proposition 5.2. Let $(X,\|\cdot\|)$ be a separable normed vector space. Assume
that there exists a sequence of seminorms $\{\|\cdot\|_k\}$ ($k=0,1,2,\dots$) on $X$ such
that for every $x\in X$ one has
\[
\|x\|_k\leq C\|x\|
\]
with a constant $C$ independent of $k$, $x$, and
\[
\|x\|_k\xrightarrow[k\to\infty]{}\|x\|_0.
\]
Let $\varphi_k$ ($k=1,2,\dots$) be a uniformly bounded sequence of linear continuous
functionals on $(X,\|\cdot\|_k)$, resp., in the sense that*
\[
c_k:=\|\varphi_k\|_{(X,\|\cdot\|_k)^*}\leq C.
\]
Then the sequence $\{\varphi_k\}$ admits a converging subsequence $\varphi_{k_n}\to\varphi_0$ in the
weak-* topology of $X^*$, and
\[
\|\varphi_0\|_{(X,\|\cdot\|_0)^*}\leq c_0:=\liminf_{k\to\infty}c_k. \tag{5.1}
\]

Proof. Since
\[
|\varphi_k(x)|\leq c_k\|x\|_k\leq C^2\|x\|
\]
for every $x\in X$, $\{\varphi_k\}$ is a bounded sequence in $X^*$. Hence, by the
Banach-Alaoglu theorem, it weakly-* converges to some $\varphi_0\in X^*$ (up to a
subsequence). Without loss of generality, $c_k\to c_0$, and passing to the limit in the
inequality $|\varphi_k(x)|\leq c_k\|x\|_k$ we deduce (5.1). □

*We recall that the continuous dual of a seminormed space is a Banach space, see [5]
Arc-length reparametrization.

Lemma 5.3. Let $\rho_0,\rho_1\in\mathcal{M}^+$ and $(\rho_t,u_t)_{t\in[0,1]}$ be an admissible path
connecting $\rho_0,\rho_1$ with finite energy $E[\rho;u]$. Then there exists an admissible
time reparametrization $(\tilde\rho_t,\tilde u_t)_{t\in[0,1]}$ with energy $E[\tilde\rho;\tilde u]\leq E[\rho;u]$ such that
$\|\tilde u_t\|^2_{H^1(d\tilde\rho_t)}=\mathrm{cst}=E[\tilde\rho;\tilde u]$ for a.e. $t\in[0,1]$.

Proof. The argument is adapted from [3, Lemma 1.1.4]. Observing that
$t\mapsto\|u_t\|_{H^1(d\rho_t)}\in L^2(0,1)\subset L^1(0,1)$ we see that
\[
s(t)=\int_0^t\|u_\tau\|_{H^1(d\rho_\tau)}\,d\tau
\]
is absolutely continuous, nondecreasing, and
\[
s(0)=0,\qquad s(1)=\int_0^1\|u_\tau\|_{H^1(d\rho_\tau)}\,d\tau=:L.
\]
The left-continuous inverse
\[
[0,L]\ni s\mapsto t(s):=\min\{t\in[0,1]:\ s(t)=s\}
\]
is a monotone increasing function, and has therefore countably many jumps.
Denoting $\lambda_t:=\|u_t\|_{H^1(d\rho_t)}$ and observing that $\frac{d}{dt}s(t)=\lambda_t$ we see that the
countable set of discontinuities of $t(s)$ is precisely the image by $s$ of the
critical points $\{t\in[0,1]:\ \lambda_t=0\}$, which by countability has zero $ds$ measure
in $[0,L]$. As a consequence $\lambda_{t(s)}$ is positive $ds$ a.e. in $[0,L]$, and
\[
s\in[0,L]:\qquad\hat\rho_s:=\rho_{t(s)}\quad\text{and}\quad\hat u_s:=\frac{u_{t(s)}}{\|u_{t(s)}\|_{H^1(d\rho_{t(s)})}}
\]
are well-defined and measurable in $s$ with of course
\[
\|\hat u_s\|_{H^1(d\hat\rho_s)}=1\quad\text{for a.e. }s\in[0,L].
\]
Exploiting the narrow continuity of $t\mapsto\rho_t$ and $s(t(s))=s$ it is easy to see
that $s\mapsto\hat\rho_s$ is narrowly continuous and connects $\rho_0,\rho_1$ in time $s\in[0,L]$.
Furthermore one can check that $\partial_s\hat\rho_s+\operatorname{div}(\hat\rho_s\nabla\hat u_s)=\hat\rho_s\hat u_s$ in the sense of
distributions $\mathcal{D}'((0,L)\times\mathbb{R}^d)$, which formally follows by the chain rule
\[
\partial_s\hat\rho_s=\frac{dt}{ds}(s)\,\partial_t\rho_{t(s)}
=\frac{-\operatorname{div}\big(\rho_{t(s)}\nabla u_{t(s)}\big)+\rho_{t(s)}u_{t(s)}}{\|u_{t(s)}\|_{H^1(d\rho_{t(s)})}}
=-\operatorname{div}(\hat\rho_s\nabla\hat u_s)+\hat\rho_s\hat u_s
\]
(this can be made rigorous using change of variables and $\|u_{t(s)}\|_{H^1(d\rho_{t(s)})}>0$
a.e. $s\in[0,L]$, see e.g. [3, Lemma 8.1.3]).

As a consequence $(\hat\rho_s,\hat u_s)_{s\in[0,L]}$ is an admissible curve connecting $\rho_0,\rho_1$ with
energy
\[
E[\hat\rho;\hat u]=\int_0^L\|\hat u_s\|^2_{H^1(d\hat\rho_s)}\,ds=\int_0^L1\,ds=L.
\]
Setting $(\tilde\rho_t,\tilde u_t)_{t\in[0,1]}:=(\hat\rho_{tL},L\hat u_{tL})_{t\in[0,1]}$ in order to connect in time $t\in[0,1]$,
the energy finally scales according to Remark 2.3 as
\[
E[\tilde\rho;\tilde u]=L\cdot E[\hat\rho;\hat u]=L^2
=\left(\int_0^1\|u_t\|_{H^1(d\rho_t)}\,dt\right)^2\leq\int_0^1\|u_t\|^2_{H^1(d\rho_t)}\,dt=E[\rho;u]
\]
as desired. By construction we have that $\|\tilde u_t\|_{H^1(d\tilde\rho_t)}$ is constant in time,
and because $\int_0^1\|\tilde u_t\|^2_{H^1(d\tilde\rho_t)}\,dt=E[\tilde\rho;\tilde u]$ we
conclude that $\|\tilde u_t\|^2_{H^1(d\tilde\rho_t)}=E[\tilde\rho;\tilde u]$ a.e. $t\in(0,1)$ and the proof is complete.

□
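
The construction of $s(t)$ and of its inverse $t(s)$ has a straightforward discrete analogue. The following Python sketch is a generic numerical illustration, not part of the proof: given samples of a speed profile on a time grid, it accumulates the arc length by the trapezoid rule and inverts it by linear interpolation, picking the leftmost preimage on flat pieces in the spirit of the left-continuous inverse. The function name and the sample profile are hypothetical.

```python
import bisect


def arc_length_inverse(times, speeds):
    """Discrete analogue of s(t) and of its inverse t(s).

    times  -- increasing list of sample times
    speeds -- nonnegative speed samples at those times
    Returns (s_values, total_length, t_of_s)."""
    s_values = [0.0]
    for i in range(1, len(times)):
        # trapezoid rule on each subinterval
        step = 0.5 * (speeds[i - 1] + speeds[i]) * (times[i] - times[i - 1])
        s_values.append(s_values[-1] + step)
    total_length = s_values[-1]

    def t_of_s(s):
        # first grid index j with s_values[j] >= s (leftmost preimage on plateaus)
        j = bisect.bisect_left(s_values, s)
        if j == 0:
            return times[0]
        if j >= len(times):
            return times[-1]
        s_lo, s_hi = s_values[j - 1], s_values[j]   # here s_lo < s <= s_hi
        theta = (s - s_lo) / (s_hi - s_lo)
        return times[j - 1] + theta * (times[j] - times[j - 1])

    return s_values, total_length, t_of_s


if __name__ == "__main__":
    grid = [i / 1000 for i in range(1001)]
    profile = [2 * t for t in grid]          # speed 2t, so that s(t) = t^2 and L = 1
    s_values, length, t_of_s = arc_length_inverse(grid, profile)
    print("total length:", length)
    print("t(0.25):", t_of_s(0.25))
```

In the example the speed vanishes only at $t=0$, a set of measure zero, which mirrors the observation in the proof that the critical points do not affect the reparametrized curve.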

Lower-semicontinuous translation of the Hopf-Rinow theorem. 

Lemma 5.4. Let a metric space $(X,\varrho)$ be a complete length space. Assume
that there exists a $\varrho$-boundedly compact Hausdorff topology $\sigma$ on $X$ (i.e.
$\varrho$-bounded sequences contain $\sigma$-converging subsequences) such that $\varrho$ is
sequentially lower semicontinuous with respect to $\sigma$. Then $(X,\varrho)$ is a geodesic
space.

Proof. Fix any two points $x,y\in X$. By [8, Theorem 2.4.16(1)], it suffices to
show that they admit a midpoint, i.e. a point $z$ such that
\[
\varrho(x,y)=2\varrho(x,z)=2\varrho(z,y).
\]
By [8, Lemma 2.4.10], there exists a sequence $z_k$ of almost midpoints, i.e.
\[
|\varrho(x,y)-2\varrho(x,z_k)|\leq k^{-1},\qquad|\varrho(x,y)-2\varrho(y,z_k)|\leq k^{-1}.
\]
The sequence $\{z_k\}$ is $\varrho$-bounded, thus without loss of generality it $\sigma$-converges
to some $z\in X$. Then
\[
2\varrho(x,z)\leq\lim_{k\to\infty}2\varrho(x,z_k)=\varrho(x,y),\qquad
2\varrho(y,z)\leq\lim_{k\to\infty}2\varrho(y,z_k)=\varrho(x,y).
\]
But it is clear from the triangle inequality that the latter inequalities must
be equalities. □
A generalized Beckner inequality. All the integrals below are implicitly
computed with respect to the Lebesgue measure $dx$.

Theorem 8. Let $\Omega\subset\mathbb{R}^d$ be a bounded, connected, open domain, satisfying
the cone property. Let $m:\Omega\to\mathbb{R}$ be a Lipschitz function such that
$\inf_{x\in\Omega}m(x)>0$. There exists a strictly increasing continuous scalar function
$\Phi$ (depending merely on $\Omega$ and $m$) such that $\Phi(0)=0$ and
\[
\Phi\left(\int_\Omega\rho\right)\int_\Omega|\rho-m|^2\leq\int_\Omega\rho|\rho-m|^2+\int_\Omega\rho|\nabla(\rho-m)|^2 \tag{5.2}
\]
for every non-negative $\rho\in H^1(\Omega)\cap L^\infty(\Omega)$.

Proof. Step 1. Without loss of generality, we may rescale the problem so 
that Ll has Lebesgue measure 1. Assume first that m{x) = 1. Under these 
assumptions, the generalized Beckner inequality m Lemma 4] with p = 3/2, 
$q=4/3$ implies



(5.3) 


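for every non-negative $\rho\in H^1(\Omega)\cap L^\infty(\Omega)$, where $C_\Omega$ depends only on $\Omega$. Let
$\lambda=\int_\Omega\rho\geq0$. If $\lambda=0$ then (5.2) trivially holds with $\Phi(0)=0$. If $\lambda=1$,
then $\|\rho\|_{L^2(\Omega)}\geq1$, and (5.3) yields
\[
\int_\Omega|\rho-1|^2\leq C_\Omega\int_\Omega\rho|\nabla\rho|^2,
\]
which is even stronger than (5.2).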

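Step 2. We now consider the case of arbitrary $\lambda>0$. We set
$\Phi(\lambda):=\min\left\{\lambda,\frac{\lambda^2}{2C_\Omega}\right\}$. Let $\rho_\lambda=\rho/\lambda$. Since we rescaled $|\Omega|=1$, by the Hölder
inequality we have that
\[
\|\rho_\lambda\|_{L^3(\Omega)}\geq\|\rho_\lambda\|_{L^2(\Omega)}\geq\|\rho_\lambda\|_{L^1(\Omega)}=1,
\]
so
\[
\|\rho_\lambda\|^3_{L^3(\Omega)}\geq\|\rho_\lambda\|^2_{L^2(\Omega)}.
\]
Then we discover that
\[
\begin{aligned}
\int_\Omega|\rho-1|^2&=\int_\Omega|\lambda\rho_\lambda-1|^2=\lambda^2\left(\int_\Omega|\rho_\lambda|^2\right)-2\lambda+1\\
&=(\lambda^2-2\lambda)\left(\int_\Omega\rho_\lambda^2\right)+1+2\lambda\int_\Omega|\rho_\lambda-1|^2\\
&\leq\lambda^2\left(\int_\Omega\rho_\lambda^3\right)-2\lambda\left(\int_\Omega\rho_\lambda^2\right)+1+2\lambda\int_\Omega|\rho_\lambda-1|^2\\
&=\int_\Omega\rho_\lambda|\lambda\rho_\lambda-1|^2+2\lambda\int_\Omega|\rho_\lambda-1|^2
\leq\int_\Omega\rho_\lambda|\lambda\rho_\lambda-1|^2+2\lambda C_\Omega\int_\Omega\rho_\lambda|\nabla\rho_\lambda|^2\\
&=\frac1\lambda\int_\Omega\rho|\rho-1|^2+\frac{2C_\Omega}{\lambda^2}\int_\Omega\rho|\nabla\rho|^2
\leq\frac{1}{\Phi(\lambda)}\left[\int_\Omega\rho|\rho-1|^2+\int_\Omega\rho|\nabla\rho|^2\right].
\end{aligned}
\]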

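Step 3. We will now treat the case of generic $m(x)>0$ by a perturbation
argument. Throughout this step we denote by $C_m$ a generic constant
depending only on $m$. Let $\rho_m(x)=\rho(x)/m(x)$. Then in the light of the
previous step we see that
\[
\begin{aligned}
\Phi\left(\int_\Omega\rho\right)\int_\Omega|\rho-m|^2
&=\Phi\left(\int_\Omega m\rho_m\right)\int_\Omega m^2|\rho_m-1|^2\\
&\leq C_m\,\Phi\left(\int_\Omega\rho_m\right)\int_\Omega|\rho_m-1|^2\\
&\leq C_m\left[\int_\Omega\rho_m|\rho_m-1|^2+\int_\Omega\rho_m|\nabla\rho_m|^2\right],
\end{aligned}
\]
and it suffices to show that the latter sum does not exceed
\[
C_m\left[\int_\Omega\rho|\rho-m|^2+\int_\Omega\rho|\nabla(\rho-m)|^2\right].
\]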


Let a £ (0,1) be such that 
1 


a 


— 1 j sup {m(3:)|Vm(x)p} = inf {m^(x)} . 
x(^n x^n 


Then we find that 



-lp-F(l-a) / m^pmlVpr; 

Jn 

- Ip -h (1 - a) / m^Pm\Vpm 

Jn 

mpm\Pm - IpiVmp 


p 

2 
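\[
\begin{aligned}
&\leq C_m\left[\int_\Omega m^3\rho_m|\rho_m-1|^2+\int_\Omega m^3\rho_m|\nabla\rho_m|^2
+2\int_\Omega m^2\rho_m(\rho_m-1)\nabla m\cdot\nabla\rho_m+\int_\Omega m\rho_m|\rho_m-1|^2|\nabla m|^2\right]\\
&=C_m\left[\int_\Omega m^3\rho_m|\rho_m-1|^2+\int_\Omega m\rho_m\left|(m\nabla\rho_m+\rho_m\nabla m)-\nabla m\right|^2\right]\\
&=C_m\left[\int_\Omega\rho|\rho-m|^2+\int_\Omega\rho|\nabla(\rho-m)|^2\right].
\end{aligned}
\]

□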